Introduction
Recently I noticed that there is very little information on Habr about developing kernel modules.
It has always surprised me that people who know C reasonably well are afraid of kernel code and even avoid reading it, as if it were 60% assembler (which, in fact, is not that complicated either). I plan to write a series of articles on developing new netfilter and iptables modules and refining existing ones.
I hope they will be interesting to novice kernel developers, driver writers, or simply people who want to try their hand at a new area of development.
What we will do
As the title says, we will write a simple iptables module based on xt_string. xt_string is a netfilter module that can search for a sequence of bytes in a packet. In my opinion, though, it lacks the ability to search for several byte sequences in a given order. Well, since it is GPL-licensed, what stops us from giving it that ability?
So in this article we will write such a module and call it xt_wildstring. It can be used (for some shameless PR) as follows:
iptables -I FORWARD -p tcp --dport 80 --tcp-flags ACK,PSH ACK,PSH -m wildstring --wildstring "reductor*price*carbonsoft.ru" -j DROP
I am starting to write this article at the same time as I start development.
It is worth noting right away that this module was not written for production. It is only a simple example that lets you quickly set up a process for developing and testing kernel modules and get to know netfilter a little more deeply.
Briefly about how netfilter and iptables are organized
As a rule, an iptables module consists of two parts: kernelspace and userspace. The kernelspace part is a Linux kernel module that can be loaded dynamically; it is what actually works with packets once we add a rule to iptables. The userspace part is the iptables module itself, which lets you create rules and pass them to the Linux kernel.
Netfilter modules can be divided into three categories:
- Hooks: essentially the default chains and tables inserted into a packet's path through the kernel.
- Matches: modules that return true or false and let you express conditions, for example to determine which protocol a packet belongs to.
- Targets: modules that perform some action on a packet. The best known are ACCEPT and DROP, although in fact there are many more.
Where these modules live in the source tree:
Netfilter is part of the Linux kernel source, and in version 2.6.32 it is spread over several directories:
/usr/src/linux/net/netfilter/ - most of the match modules.
/usr/src/linux/net/ipv4/netfilter/ - some of the target modules.
/usr/src/linux/include/linux/netfilter/ - headers for both kinds of modules.
The iptables modules are located in the directory
/usr/src/iptables/extensions/
The headers of the kernelspace and userspace modules must be identical, so it is best if they are a single file.
Now let's move from theory to practice
We will not reinvent the wheel; that is not what the GPL was invented for. We take the xt_string module from the latest CentOS 6 kernel, as one of the most stable at the moment.
There turned out to be a lot of material about setting up the build system and the test bench, so it is gathered in a separate section below. If anything is unclear, or you are curious about where and how things are built, run and tested, it is worth a look.
Setting up the build system and test bench
Preparing the build and debug system
Yes, many people dream of a convenient IDE for Linux kernel development. Alas, I have not found anything worthwhile. One of the reasons is fairly simple: a segfault in the kernel means a kernel panic, and if the panic happens on our working machine we lose a lot of time rebooting. That is why development is usually done in a virtual machine, or on a separate test bench if the code is written for specific hardware. Our module is hardware-independent, so we will set up virtual machines.
Installing CentOS on two virtual machines
So that we do not sit idle during a kernel panic (and failures are guaranteed to happen), we proceed as follows. We install two virtual machines that have access to the Internet and to each other. One will be the build machine for the module, and the other the test bench.
Getting the linux and iptables source code on the build machine
By the way, on the build machine we will need a few good and useful programs.
yum install git ncurses-devel make gcc rpm-build indent
Now let's bookmark one of the most useful repositories for anyone developing under CentOS:
http://vault.centos.org/6.4/os/Source/SPackages/
From here we will take the src.rpm packages of the Linux kernel and iptables.
rpm -i http://vault.centos.org/6.4/os/Source/SPackages/kernel-2.6.32-358.el6.src.rpm
rpm -i http://vault.centos.org/6.4/os/Source/SPackages/iptables-1.4.7-9.el6.src.rpm
Then go to /root/rpmbuild/SPECS/ and unpack the source code with the CentOS patches applied.
rpmbuild -bp iptables.spec
rpmbuild -bp kernel.spec
In /root/rpmbuild/BUILD/ we now have directories with the Linux kernel and iptables sources.
Now we need to build the entire kernel at least once, so that later we can rebuild only the net/netfilter/ directory when we change our module. For convenience and familiarity, we make symlinks:
ln -s /root/rpmbuild/BUILD/kernel-2.6.32-358.el6/linux-2.6.32-358.el6.x86_64/ /usr/src/linux
ln -s /root/rpmbuild/BUILD/iptables-1.4.7/ /usr/src/iptables/
Go to /usr/src/linux. First, let's generate a config.
make menuconfig
Save it and build the whole kernel. By the way, rpmbuild or make may hang on the message gpg: keyring `./pubring.gpg' created. To avoid this, we tell the system that random is urandom.
rm -f /dev/random
ln -s /dev/urandom /dev/random
And the build itself:
make prepare
make -j 3
make modules_install
In general, it is a good idea to keep all of the module's source code in a Git repository. Mine is in ~/GIT/wildstring/.
Rebooting the test bench on kernel panic
This can be done in two ways. In my opinion, the most correct one is to set the /proc/sys/kernel/panic parameter to 2, so that the kernel reboots by itself 2 seconds after a panic. But the panic output is important to us, so if necessary you can run a script like this on the host system:
name=centos_test
ip=<ip_>
while true; do
    if ! ping -qc 1 $ip; then
        virt-viewer $name
        sleep 2
        scrot
        virsh destroy $name
        virsh start $name
        sleep 60
    fi
done
Checking that the module works
The test script can be used like this:
One-time run:
./test_wildstring.sh
Endless loop:
./test_wildstring.sh while
Copying xt_string from linux and iptables
We find the modules we need and copy them into our repository.
cp -v /usr/src/linux/net/netfilter/xt_string.c ~/GIT/wildstring/xt_wildstring.c
mkdir -p ~/GIT/wildstring/include/linux/netfilter/
cp -v /usr/src/linux/include/linux/netfilter/xt_string.h ~/GIT/wildstring/include/linux/netfilter/xt_wildstring.h
Writing the Makefile
We describe how to build the kernel module and the iptables module, plus code formatting, cleanup of the working directory, and a couple of other targets.
obj-m += xt_wildstring.o

all: module lib

module:
	cp include/linux/netfilter/xt_wildstring.h /usr/src/linux/include/linux/netfilter/xt_wildstring.h
	make -C /lib/modules/2.6.32/build M=$(PWD) modules

lib:
	cp libxt_wildstring.c /usr/src//iptables/extensions
	cp include/linux/netfilter/xt_wildstring.h /usr/src/iptables/include/linux/netfilter/xt_wildstring.h
	make -C /usr/src/iptables/extensions
	cp /usr/src/iptables/extensions/libxt_wildstring.so libxt_wildstring.so

userspace:
	gcc userspace_wildstring.c -o userspace
	./userspace
	rm -f userspace

install:
	scp xt_wildstring.ko root@10.90.140.160:
	scp libxt_wildstring.so root@10.90.140.160:/lib64/xtables-1.4.7/

clean:
	rm -f *~ *.ko *.so *.mod.c *.ko.unsigned *.o modules.order Module.symvers

indent:
	Lindent *.c include/linux/netfilter/xt_wildstring.h
Comments on the Makefile:
- 2.6.32 is hardcoded because uname -r gives 2.6.32-358.0.1.el6.x86_64, but I do not have those sources at hand, so the symlink /lib/modules/2.6.32-358.0.1.el6.x86_64/build would not work.
- Since I am no Makefile guru and could not come up with a neat and proper way to build libxt_wildstring.so the way xt_wildstring.ko is built, I decided not to bother and wrote this target as plain shell commands.
- For scp in the install target to work without a password, you need to generate SSH keys on the build machine and copy the public key to the test bench.
- The Lindent command is copied from /usr/src/linux/scripts/Lindent to /usr/local/bin, since it is used often. I recommend always using it when writing Linux kernel code: when in Rome, do as the Romans do. Better still, run it before every commit.
Hiding the clutter with .gitignore
Untracked files in git status are a bit annoying, so let's create ~/GIT/wildstring/.gitignore:
*.o
*.so
.*
*.ko
*.ko.unsigned
modules.order
Module.symvers
*.mod.c
!.gitignore
Renaming to wildstring
So that the module does not conflict with the original, it makes sense to rename it and all of its functions from string to wildstring. An important point: everything has to be edited, that is, the header, the userspace module and the kernelspace module. Here grep comes to the rescue:
grep -ri string xt_wildstring.c | grep -vi wildstring
Extending the match info structure
And again a bit of theory: every match module has its own match-info structure, which is filled in from the parameters passed from userspace. It is described in the header file (xt_wildstring.h).
The standard xt_string.h looks like this.
#ifndef _XT_STRING_H
#define _XT_STRING_H

#include <linux/types.h>

#define XT_STRING_MAX_PATTERN_SIZE 128
#define XT_STRING_MAX_ALGO_NAME_SIZE 16

enum {
	XT_STRING_FLAG_INVERT		= 0x01,
	XT_STRING_FLAG_IGNORECASE	= 0x02
};

struct xt_string_info {
	__u16 from_offset;	/* offset at which the search starts */
	__u16 to_offset;	/* offset up to which the search runs */
	char algo[XT_STRING_MAX_ALGO_NAME_SIZE];	/* search algorithm name */
	char pattern[XT_STRING_MAX_PATTERN_SIZE];	/* the pattern to search for */
	__u8 patlen;		/* length of the pattern */
	union {
		struct {
			__u8 invert;	/* ! -m string --string "something" */
		} v0;

		struct {
			__u8 flags;	/* XT_STRING_FLAG_* bits */
		} v1;
	} u;

	/* Used internally by the kernel */
	struct ts_config __attribute__((aligned(8))) *config;
};

#endif /*_XT_STRING_H*/
Duplicating several fields of the xt_wildstring_info structure in xt_wildstring.h
To begin with, we add pointers to the substrings. They are pointers rather than character arrays as in the original, because the second and third pointers may be empty, that is, a pattern without asterisks may be passed to the module. By analogy, we add variables that hold the lengths of the substrings, plus a text search configuration structure for each substring. As a result, the structure looks like this:
#ifndef _XT_WILDSTRING_H
#define _XT_WILDSTRING_H

#include <linux/types.h>

#define XT_WILDSTRING_MAX_PATTERN_SIZE 128
#define XT_WILDSTRING_MAX_ALGO_NAME_SIZE 16

enum {
	XT_WILDSTRING_FLAG_INVERT	= 0x01,
	XT_WILDSTRING_FLAG_IGNORECASE	= 0x02
};

struct xt_wildstring_info {
	__u16 from_offset;
	__u16 to_offset;
	char algo[XT_WILDSTRING_MAX_ALGO_NAME_SIZE];
	char pattern[XT_WILDSTRING_MAX_PATTERN_SIZE];
	/* pointers to the substrings of the pattern */
	char *pattern_part1;
	char *pattern_part2;
	char *pattern_part3;
	__u8 patlen;
	/* lengths of the substrings */
	__u8 patlen_part1;
	__u8 patlen_part2;
	__u8 patlen_part3;
	union {
		struct {
			__u8 invert;
		} v0;

		struct {
			__u8 flags;
		} v1;
	} u;

	/* Used internally by the kernel */
	/* one text search config per substring */
	struct ts_config __attribute__((aligned(8))) *config;
	struct ts_config __attribute__((aligned(8))) *config_part1;
	struct ts_config __attribute__((aligned(8))) *config_part2;
	struct ts_config __attribute__((aligned(8))) *config_part3;
};

#endif
Starting to use the new header fields
Go to xt_wildstring.c.
Now it is time to use what we added to the header. To begin with, we will sort out the preparation and destruction of the search configs.
Here, again, a bit of theory. As a rule, a match module contains the following functions and structures:
- init - initializes the module when it is loaded;
- exit - tears the module down when it is unloaded;
- mt - the packet checking function;
- mt_check - checks that the module is invoked correctly when a rule is added;
- mt_destroy - frees resources when a rule is deleted;
- mt_reg - a structure with pointers to mt_check, mt and mt_destroy plus additional information about the module.
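To make the division of labour between these callbacks more tangible, here is a minimal userspace analogy. It is not kernel code and does not use the netfilter API: the names demo_info, demo_match, demo_check, demo_mt, demo_destroy and demo_reg are all hypothetical and exist only for this illustration. It merely shows the lifecycle: check runs once when a rule is added and allocates resources, mt runs for every packet, and destroy releases the resources when the rule is deleted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-rule data, standing in for a match-info structure. */
struct demo_info {
	const char *pattern;	/* supplied when the rule is created */
	size_t patlen;		/* filled in by check() */
	char *prepared;		/* resource owned by the rule, freed in destroy() */
};

/* Stand-in for mt_reg: a name plus pointers to the three callbacks. */
struct demo_match {
	const char *name;
	bool (*check)(struct demo_info *info);
	bool (*mt)(const char *pkt, size_t len, const struct demo_info *info);
	void (*destroy)(struct demo_info *info);
};

/* Runs once when a rule is added: validate arguments, allocate resources. */
static bool demo_check(struct demo_info *info)
{
	if (info->pattern == NULL || info->pattern[0] == '\0')
		return false;
	info->patlen = strlen(info->pattern);
	info->prepared = malloc(info->patlen);
	if (info->prepared == NULL)
		return false;
	memcpy(info->prepared, info->pattern, info->patlen);
	return true;
}

/* Runs for every packet: true if the prepared pattern occurs in it. */
static bool demo_mt(const char *pkt, size_t len, const struct demo_info *info)
{
	if (info->patlen > len)
		return false;
	for (size_t i = 0; i + info->patlen <= len; i++)
		if (memcmp(pkt + i, info->prepared, info->patlen) == 0)
			return true;
	return false;
}

/* Runs once when the rule is deleted: release what check() allocated. */
static void demo_destroy(struct demo_info *info)
{
	free(info->prepared);
	info->prepared = NULL;
}

/* The "registration" structure ties the callbacks together. */
static const struct demo_match demo_reg = {
	.name = "demo",
	.check = demo_check,
	.mt = demo_mt,
	.destroy = demo_destroy,
};
```

A caller would invoke demo_reg.check() once, demo_reg.mt() for each buffer, and demo_reg.destroy() at the end. In the real module the same roles are played by the kernel callbacks listed above, with their own kernel-defined signatures.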
In the original xt_string, a rule is added and deleted as follows:
In string_mt_check (adding a rule), a ts_config structure (ts stands for text search) is built from the pattern string and the search algorithm. The function that searches the packet (skb_find_text) takes it as a parameter. The memory occupied by this structure is freed in string_mt_destroy by the textsearch_destroy function, which is called when the rule is removed from the chain.
Adding a few textsearch_prepare calls to xt_wildstring_check
Before changing anything, we stub out the original wildstring_mt function, the one that actually checks a packet as it passes through the rule. Changes should be made little by little, and this function depends heavily on them, but for now it does not matter to us.
static bool wildstring_mt(const struct sk_buff *skb, const struct xt_match_param *par)
{
	return false;
#if 0
	...
#endif
}
First, let's prepare our ts_conf structures in the xt_wildstring_check function, which is called when a rule is added to iptables. We copy the pointer to the beginning of the pattern into a temporary variable and walk through it with strsep, which splits a string on a given set of delimiter characters. If a token is found, we calculate its length and use it to prepare the text search parameters.
s = (char *) conf->pattern;
conf->pattern_part1 = strsep(&s, delim);
if (!conf->pattern_part1)
	return false;
The next two ts_conf structures are filled in the same way, with one difference: if the pattern pointer is empty, this is no longer an error, and we return true, that is, we simply work with fewer patterns.
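The splitting step on its own can be tried out in userspace. The following is a minimal sketch, not the module's code: the helper split_pattern and the constant MAX_PARTS are hypothetical names, and treating an empty part or a fourth part as an error is an assumption made for this illustration. It uses the libc strsep, which behaves like the kernel function of the same name: it writes a terminating zero over the delimiter and advances the pointer past it.

```c
#define _DEFAULT_SOURCE		/* exposes strsep() on glibc */
#include <stddef.h>
#include <string.h>

#define MAX_PARTS 3		/* the module handles at most three substrings */

/*
 * Split `pattern` in place on '*' into at most MAX_PARTS parts.
 * Returns the number of parts, or -1 if a part is empty or if
 * there are more than MAX_PARTS parts (assumed policy for this sketch).
 */
static int split_pattern(char *pattern, char *parts[MAX_PARTS],
			 size_t lens[MAX_PARTS])
{
	char *s = pattern;	/* temporary cursor, advanced by strsep() */
	int n = 0;

	while (n < MAX_PARTS) {
		char *tok = strsep(&s, "*");

		if (tok == NULL)	/* no more tokens */
			break;
		if (tok[0] == '\0')	/* empty part, e.g. "a**b" or "" */
			return -1;
		parts[n] = tok;
		lens[n] = strlen(tok);
		n++;
	}
	if (s != NULL)		/* text left over after the third part */
		return -1;
	return n;
}
```

For the pattern "reductor*price*carbonsoft.ru" this yields three parts, and a pattern without asterisks yields a single part, which corresponds to the case where the second and third pointers stay empty.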
And destroying them in wildstring_mt_destroy
This function is called when a rule is deleted from iptables. To free the search parameters when a rule is deleted, we repeat the destroy call for each part.
static void wildstring_mt_destroy(const struct xt_mtdtor_param *par)
{
	struct xt_wildstring_info *conf = WILDSTRING_TEXT_PRIV(par->matchinfo);

	if (conf->pattern_part1)
		textsearch_destroy(conf->config_part1);
	if (conf->pattern_part2)
		textsearch_destroy(conf->config_part2);
	if (conf->pattern_part3)
		textsearch_destroy(conf->config_part3);
}
Finishing the match function
Now the module loads and unloads successfully, rules are added and deleted, and there is no kernel panic. Let's go back to the wildstring_mt function we stubbed out earlier and add a search for every pattern passed to the module.
First, we need a variable to store the offset at which the required substring was found.
unsigned int skb_find = 0;
This is not the best name in general; something like tmp_from_offset or wildstring_from_offset would be much clearer, but it is already in the GitHub commits, so, alas, it is too late. Now, instead of returning the result of the first search, we assign it to our new variable and check it: if nothing is found, we return false, and so on until we have gone through all the specified patterns.
memset(&state, 0, sizeof(struct ts_state));
skb_find = skb_find_text((struct sk_buff *)skb, conf->from_offset,
			 conf->to_offset, conf->config_part1, &state);
if (skb_find == UINT_MAX)
	return false;
We repeat the same for config_part2 and config_part3, except that we first check whether pattern_part2 and pattern_part3 are present and return true if they are absent.
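The idea of matching several substrings in a given order can be sketched in userspace on an ordinary buffer. This is only an illustration under stated assumptions, not the module's code: find_from and match_in_order are hypothetical helpers, a naive byte comparison stands in for the kernel text search, and it is assumed that each search starts right after the end of the previous match.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NOT_FOUND ((size_t)-1)

/*
 * Naive search for `needle` in hay[from..hay_len).
 * Returns the absolute offset of the first match or NOT_FOUND.
 */
static size_t find_from(const char *hay, size_t hay_len, size_t from,
			const char *needle, size_t needle_len)
{
	if (needle_len == 0 || from > hay_len || needle_len > hay_len - from)
		return NOT_FOUND;
	for (size_t i = from; i + needle_len <= hay_len; i++)
		if (memcmp(hay + i, needle, needle_len) == 0)
			return i;
	return NOT_FOUND;
}

/*
 * True if all parts occur in `hay` in the given order, each one
 * starting after the end of the previous match.
 */
static bool match_in_order(const char *hay, size_t hay_len,
			   const char *const parts[], const size_t lens[],
			   size_t nparts)
{
	size_t from = 0;	/* where the next search starts */

	for (size_t k = 0; k < nparts; k++) {
		size_t pos = find_from(hay, hay_len, from, parts[k], lens[k]);

		if (pos == NOT_FOUND)
			return false;	/* one missing part fails the match */
		from = pos + lens[k];	/* continue after this match */
	}
	return true;
}
```

With the parts "reductor", "scheme" and "carbonsoft.ru", a buffer in which they appear in that order matches, while the same parts given in reverse order do not. Because the parts are passed as arrays, the same loop works for any number of substrings.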
Finishing up and testing
Next we fix all the compilation errors. In general, it is better to compile as often as possible and, after each logically complete step, to run the module test in an endless loop until the next piece is added or we notice that a kernel panic has occurred. This is worth doing because a mistake costs much more, and much more time passes between writing code and checking that it fully works, than with most userspace utilities. That is why so much attention was paid at the beginning of the article to a convenient build and debug setup on the test bench: as everyone knows, no matter how good a thing is inside, if it is inconvenient to use, it will not be used.
We test on a couple of cases using wget or curl. When creating a rule, it is important to remember that in an HTTP request the GET line comes before the Host header, so the pattern has to be written somewhat back to front:
- "something*html*example.com"
- "pron*avi*yoursite"
- "reductor*scheme*carbonsoft.ru"
That is, we add the rule:
iptables -I OUTPUT -p tcp --dport 80 -m wildstring --wildstring "reductor*scheme*carbonsoft" -j DROP
and try to download the page:
wget -t 1 -t 1 http://www.carbonsoft.ru/products/reductor/carbon-reductor/#scheme
Bingo: the download fails, and iptables -nvL OUTPUT shows an increased packet counter.
Why not lists?
An attentive and experienced reader may exclaim: why all these perversions and crutches, when you could use lists, add and remove a structure consisting of pattern, patlen and config, and then walk the list with for_each_entry? But the purpose of the article is to show how a netfilter module is organized, and working with lists in the Linux kernel would add one more entity to the module that has to be understood. Besides, something must be left to the reader as an independent exercise.
Conclusion
So, we have learned how to write kernel modules for netfilter. Isn't that great?
In general, the module can be used not only for HTTP but also for many other protocols; I may add examples later in the comments.
The sources are available in the opensource section on our website.