
Another virtual interface

The previous article showed a sketch of a Linux kernel module that creates an additional virtual network interface. It was a simplified fragment of a real project that had worked for several years without failures or complaints, so it can well serve as a template for further improvement, correction and development.

But that approach is, firstly, not the only one and, secondly, may be unacceptable in some situations (for example, in an embedded system with a kernel older than 2.6.36, where the netdev_rx_handler_register() call does not exist yet). Below we consider an alternative with the same functionality, implemented on a completely different layer of the TCP/IP network stack.

Network layer protocols


Much has been written, and need not be repeated here, about the fact that the layers of the TCP/IP network stack do not map cleanly onto the seven layers of the OSI open systems interconnection model (or, to be honest, that the OSI model, so dear to academia, does not adequately describe real-life TCP/IP networking). In the previous implementation the virtual interface was created at the interface layer (L2, Layer 2, very roughly corresponding to the OSI data link layer). The current implementation uses the facilities of the network layer (L3).

It is worth reviewing a certain minimum of network layer facilities, even slightly more than the current task needs, to leave room for later extension. The network layer of the protocol stack (TCP/IP, but not only: all other protocol families are supported here as well, though today they seem to be of little relevance) processes protocols such as IP (IPv4/IPv6), IPX, IGMP, RIP, OSPF and ARP, and custom user protocols can be added to it. To install network layer handlers, the following API is provided (<linux/netdevice.h>):
    struct packet_type {
       __be16 type;              /* This is really htons(ether_type). */
       struct net_device *dev;   /* NULL is wildcarded here */
       int (*func) (struct sk_buff*, struct net_device*,
                    struct packet_type*, struct net_device*);
       ...
       struct list_head list;
    };
    extern void dev_add_pack( struct packet_type *pt );
    extern void dev_remove_pack( struct packet_type *pt );

In effect, we need to add to the kernel's protocol handlers a filter through which the socket buffers of the incoming interface stream pass (the outgoing stream is simpler, as the previous implementation showed). The dev_add_pack() function adds one more handler, implemented by func(), for packets of a given type. It adds a handler but does not replace the existing ones (including the default handler of the Linux network subsystem). Socket buffers that satisfy the criteria set in struct packet_type (the protocol type and the network interface dev) are selected and passed to the function for processing.
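To make the matching rule concrete, here is a small user-space model of it (not kernel code). The names model_packet_type, model_add_pack, model_deliver, on_all and on_ip are hypothetical stand-ins introduced only for this sketch; protocol IDs are kept in host byte order for simplicity, and the order in which the kernel actually calls handlers is not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_P_ALL  0x0003   /* same numeric value as ETH_P_ALL */
#define MODEL_P_IP   0x0800   /* same numeric value as ETH_P_IP  */
#define MAX_HANDLERS 8

struct model_dev { const char *name; };

/* Hypothetical, simplified counterpart of struct packet_type. */
struct model_packet_type {
    uint16_t type;                /* protocol ID (host order in this model) */
    const struct model_dev *dev;  /* NULL matches every interface */
    int (*func)(uint16_t proto, const struct model_dev *dev);
};

static const struct model_packet_type *handlers[MAX_HANDLERS];
static size_t n_handlers;

/* Model of dev_add_pack(): appends a handler, never replaces one. */
static void model_add_pack(const struct model_packet_type *pt)
{
    assert(pt != NULL && n_handlers < MAX_HANDLERS);
    handlers[n_handlers++] = pt;
}

/* Offer one frame to every matching handler; returns how many ran. */
static int model_deliver(uint16_t proto, const struct model_dev *dev)
{
    int called = 0;
    for (size_t i = 0; i < n_handlers; i++) {
        const struct model_packet_type *pt = handlers[i];
        int type_ok = (pt->type == MODEL_P_ALL) || (pt->type == proto);
        int dev_ok  = (pt->dev == NULL) || (pt->dev == dev);
        if (type_ok && dev_ok) {
            pt->func(proto, dev);
            called++;
        }
    }
    return called;
}

/* Two sample handlers that only count their invocations. */
static int count_all, count_ip;
static int on_all(uint16_t proto, const struct model_dev *dev)
{ (void)proto; (void)dev; count_all++; return 0; }
static int on_ip(uint16_t proto, const struct model_dev *dev)
{ (void)proto; (void)dev; count_ip++; return 0; }
```

With one handler registered for every protocol on a single interface and another for IP on any interface, an IP frame arriving on that interface reaches both handlers, which mirrors the "adds but does not replace" behavior described above.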
Note: New protocols are added in the same way (by installing a filter function) at the higher, transport layer of the network stack (where, for example, UDP, TCP and SCTP are processed). All layers above that (more or less similar to the upper layers of the OSI model) are not represented in the kernel and are served in user space using BSD socket programming techniques. Details of those higher layers are not considered further in this text.

If we wanted to add a new (proprietary) protocol, we would have to define its own type:
    #define PROTO_ID 0x1234
    static struct packet_type test_proto = {
       __constant_htons( PROTO_ID ),
       ...
    }

The problem is that the standard IP stack knows nothing about such a protocol, so we would have to take over all of its processing ourselves. Our goal, however, is only to override the processing of some packets, so we use the constant ETH_P_ALL, which means that packets of all protocols must pass through the filter (and, if the dev field is NULL, packets from all network interfaces).

For comparison and to be concrete, a large number of protocol identifiers (Ethernet Protocol IDs) can be found in <linux/if_ether.h>; here are some of them:
    #define ETH_P_LOOP 0x0060 /* Ethernet Loopback packet */
    #define ETH_P_IP   0x0800 /* Internet Protocol packet */
    #define ETH_P_ARP  0x0806 /* Address Resolution packet */
    #define ETH_P_PAE  0x888E /* Port Access Entity (IEEE 802.1X) */
    #define ETH_P_ALL  0x0003 /* Every packet (be careful!!!) */
    ...

Note that the type field is not just an abstract number in the program code: this value, in binary form, is written into the Ethernet header of the frame that is physically sent onto the transmission medium:
    struct ethhdr {
       unsigned char h_dest[ETH_ALEN];   /* destination eth addr */
       unsigned char h_source[ETH_ALEN]; /* source ether addr */
       __be16 h_proto;                   /* packet type ID field */
    } __attribute__((packed));

(We will need the same description in the code when filling in the struct packet_type structure in the module).
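As an illustration of how this field sits on the wire, the following user-space sketch reads the protocol ID out of a raw frame. To stay portable it uses its own copy of the header layout (sk_ethhdr, SK_ETH_ALEN and SK_ETH_HLEN are hypothetical names for this example; on Linux the real definitions come from <linux/if_ether.h>), and frame_ethertype() is likewise an assumed helper, not a kernel function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* ntohs() */

#define SK_ETH_ALEN 6    /* octets in one Ethernet address */
#define SK_ETH_HLEN 14   /* total octets in the header */

/* Local mirror of struct ethhdr, for illustration only. */
struct sk_ethhdr {
    unsigned char h_dest[SK_ETH_ALEN];
    unsigned char h_source[SK_ETH_ALEN];
    uint16_t h_proto;    /* stored in network byte order */
} __attribute__((packed));

/* Returns the protocol ID in host byte order, or -1 for a short frame. */
static int frame_ethertype(const uint8_t *frame, size_t len)
{
    struct sk_ethhdr hdr;

    assert(sizeof(hdr) == SK_ETH_HLEN);  /* layout sanity check */
    if (frame == NULL || len < SK_ETH_HLEN)
        return -1;
    memcpy(&hdr, frame, sizeof(hdr));    /* avoids unaligned access */
    return ntohs(hdr.h_proto);
}
```

A frame whose bytes 12 and 13 are 0x08 0x06 yields 0x0806, the value of ETH_P_ARP; this is also why the type field of struct packet_type is filled with htons() of the constant rather than the constant itself.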

The filter function itself (the func field), which we still have to write, looks something like this in perhaps its simplest form:
    int test_pack_rcv( struct sk_buff *skb, struct net_device *dev,
                       struct packet_type *pt, struct net_device *odev ) {
       unsigned int len = skb->len;   /* read before dropping our reference */
       LOG( "packet received with length: %u\n", len );
       kfree_skb( skb );
       return len;
    }

The function is shown here mainly because of the mandatory call to kfree_skb(). Unlike the seemingly similar dev_kfree_skb() in the transmit path, it does not by itself destroy the socket buffer but decrements its usage counter (the users field). For each additional protocol filter installed with dev_add_pack(), this counter is incremented on the socket buffers delivered to it. You can install several network layer filters (in the same or in several loadable modules); they will all work, called in the reverse order of their installation, but each of them must call kfree_skb(). Otherwise you get a slow but steady memory leak in the network stack, whose consequences, such as a system crash, show up only after several hours of continuous operation.

This is an interesting and non-obvious point, enough so that it makes sense to digress and look at the source code of kfree_skb() (file net/core/skbuff.c):
    void kfree_skb(struct sk_buff *skb) {
       if (unlikely(!skb))
          return;
       if (likely(atomic_read(&skb->users) == 1))
          smp_rmb();
       else if (likely(!atomic_dec_and_test(&skb->users)))
          return;
       trace_kfree_skb(skb, __builtin_return_address(0));
       __kfree_skb(skb);
    }

Calling kfree_skb() actually frees the socket buffer only when skb->users == 1; for all other values it merely decrements skb->users (the usage counter).

We now have enough detail to build the virtual interface, this time using the network layer of the IP stack.
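The counting logic can be tried out in user space with a toy model. Everything here (model_skb, model_alloc, model_get, model_kfree) is a hypothetical simplification written for this sketch; it ignores atomics and memory barriers and only mirrors the "release on the last reference" rule.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy counterpart of a socket buffer: only the usage counter matters. */
struct model_skb {
    int users;         /* like skb->users */
    int *freed_flag;   /* set to 1 when the buffer is really released */
};

static struct model_skb *model_alloc(int *freed_flag)
{
    struct model_skb *skb = malloc(sizeof(*skb));
    assert(skb != NULL && freed_flag != NULL);
    skb->users = 1;
    skb->freed_flag = freed_flag;
    *freed_flag = 0;
    return skb;
}

/* One more holder, as when the buffer is handed to an extra filter. */
static void model_get(struct model_skb *skb)
{
    skb->users++;
}

/* Model of kfree_skb(): drop one reference, release on the last one. */
static void model_kfree(struct model_skb *skb)
{
    if (skb == NULL)
        return;
    if (--skb->users > 0)
        return;            /* other holders remain */
    *skb->freed_flag = 1;
    free(skb);
}
```

With two extra holders the buffer survives the first two model_kfree() calls and is released by the third. If one holder never calls it, the counter never reaches zero, which is the leak described above.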

Virtual Interface Module


We proceed as before and create two variants of the module: a simplified version, virtl.ko, whose network interface (virt0) replaces the parent network interface, and a full version, virt.ko, which analyzes network protocol frames (ARP and IPv4) and affects only the traffic addressed to its own interface. The difference is that while the simplified module is loaded, the parent interface temporarily stops working (until virtl.ko is unloaded), whereas with the full version both interfaces can work in parallel and independently. The code of the full module is noticeably more cumbersome and adds nothing to an understanding of the principles. Below, the simplified version is considered in detail, and only afterwards do we briefly touch on the full version (its code and test log are given in the archive of examples):
    #include <linux/module.h>
    #include <linux/version.h>
    #include <linux/netdevice.h>
    #include <linux/etherdevice.h>
    #include <linux/inetdevice.h>
    #include <linux/moduleparam.h>
    #include <net/arp.h>
    #include <linux/ip.h>

    #define ERR(...) printk( KERN_ERR "! "__VA_ARGS__ )
    #define LOG(...) printk( KERN_INFO "! "__VA_ARGS__ )
    #define DBG(...) if( debug != 0 ) printk( KERN_INFO "! "__VA_ARGS__ )

    static char* link = "eth0";            // name of the parent interface
    module_param( link, charp, 0 );

    static char* ifname = "virt";          // name prefix of the virtual interface
    module_param( ifname, charp, 0 );

    static int debug = 0;
    module_param( debug, int, 0 );

    static struct net_device *child = NULL;
    static struct net_device_stats stats;  // statistics of the virtual interface
    static u32 child_ip;

    struct priv {
       struct net_device *parent;
    };

    static char* strIP( u32 addr ) {       // IP address as a dotted string (static buffer)
       static char saddr[ MAX_ADDR_LEN ];
       sprintf( saddr, "%d.%d.%d.%d",
                ( addr ) & 0xFF, ( addr >> 8 ) & 0xFF,
                ( addr >> 16 ) & 0xFF, ( addr >> 24 ) & 0xFF );
       return saddr;
    }

    static int open( struct net_device *dev ) {
       struct in_device *in_dev = dev->ip_ptr;
       struct in_ifaddr *ifa = in_dev->ifa_list;  /* IP ifaddr chain */
       LOG( "%s: device opened", dev->name );
       child_ip = ifa->ifa_address;
       netif_start_queue( dev );
       if( debug != 0 ) {
          char sdebg[ 40 ] = "";
          sprintf( sdebg, "%s:", strIP( ifa->ifa_address ) );
          strcat( sdebg, strIP( ifa->ifa_mask ) );
          DBG( "%s: %s", dev->name, sdebg );
       }
       return 0;
    }

    static int stop( struct net_device *dev ) {
       LOG( "%s: device closed", dev->name );
       netif_stop_queue( dev );
       return 0;
    }

    static struct net_device_stats *get_stats( struct net_device *dev ) {
       return &stats;
    }

    // transmit path: frames sent through the virtual interface
    static netdev_tx_t start_xmit( struct sk_buff *skb, struct net_device *dev ) {
       struct priv *priv = netdev_priv( dev );
       stats.tx_packets++;
       stats.tx_bytes += skb->len;
       skb->dev = priv->parent;            // redirect to the parent (physical) interface
       skb->priority = 1;
       // log before the buffer is handed over: it must not be touched afterwards
       DBG( "tx: injecting frame from %s to %s with length: %u",
            dev->name, skb->dev->name, skb->len );
       dev_queue_xmit( skb );
       return NETDEV_TX_OK;
    }

    static struct net_device_ops net_device_ops = {
       .ndo_open = open,
       .ndo_stop = stop,
       .ndo_get_stats = get_stats,
       .ndo_start_xmit = start_xmit,
    };

    // receive path: network layer filter on the parent interface
    int pack_parent( struct sk_buff *skb, struct net_device *dev,
                     struct packet_type *pt, struct net_device *odev ) {
       unsigned int len = skb->len;        // read before dropping our reference
       skb->dev = child;                   // hand the frame over to the virtual interface
       stats.rx_packets++;
       stats.rx_bytes += len;
       DBG( "rx: injecting frame from %s to %s with length: %u",
            dev->name, skb->dev->name, len );
       kfree_skb( skb );
       return len;
    }

    static struct packet_type proto_parent = {
       __constant_htons( ETH_P_ALL ),      // could be narrowed to ETH_P_ARP & ETH_P_IP
       NULL,
       pack_parent,
       (void*)1,
       NULL
    };

    int __init init( void ) {
       void setup( struct net_device *dev ) {  // nested function (GCC extension)
          int j;
          ether_setup( dev );
          memset( netdev_priv( dev ), 0, sizeof( struct priv ) );
          dev->netdev_ops = &net_device_ops;
          for( j = 0; j < ETH_ALEN; ++j )      // fill the MAC with a dummy address
             dev->dev_addr[ j ] = (char)j;
       }
       int err = 0;
       struct priv *priv;
       char ifstr[ 40 ];
       sprintf( ifstr, "%s%s", ifname, "%d" );
    #if (LINUX_VERSION_CODE < KERNEL_VERSION(3, 17, 0))
       child = alloc_netdev( sizeof( struct priv ), ifstr, setup );
    #else
       child = alloc_netdev( sizeof( struct priv ), ifstr, NET_NAME_UNKNOWN, setup );
    #endif
       if( child == NULL ) {
          ERR( "%s: allocate error", THIS_MODULE->name );
          return -ENOMEM;
       }
       priv = netdev_priv( child );
       priv->parent = dev_get_by_name( &init_net, link );  // parent interface
       if( !priv->parent ) {
          ERR( "%s: no such net: %s", THIS_MODULE->name, link );
          err = -ENODEV;
          goto err;
       }
       if( priv->parent->type != ARPHRD_ETHER &&
           priv->parent->type != ARPHRD_LOOPBACK ) {
          ERR( "%s: illegal net type", THIS_MODULE->name );
          err = -EINVAL;
          goto err;
       }
       memcpy( child->dev_addr, priv->parent->dev_addr, ETH_ALEN );
       memcpy( child->broadcast, priv->parent->broadcast, ETH_ALEN );
       // dev_alloc_name() returns the unit number (>= 0) on success
       if( ( err = dev_alloc_name( child, child->name ) ) < 0 ) {
          ERR( "%s: allocate name, error %i", THIS_MODULE->name, err );
          err = -EIO;
          goto err;
       }
       register_netdev( child );           // register the new interface
       proto_parent.dev = priv->parent;
       dev_add_pack( &proto_parent );      // install the filter on the parent
       LOG( "module %s loaded", THIS_MODULE->name );
       LOG( "%s: create link %s", THIS_MODULE->name, child->name );
       return 0;
    err:
       if( priv->parent )                  // drop the reference taken by dev_get_by_name()
          dev_put( priv->parent );
       free_netdev( child );
       return err;
    }

    void __exit virt_exit( void ) {
       struct priv *priv = netdev_priv( child );
       dev_remove_pack( &proto_parent );   // remove the filter
       unregister_netdev( child );
       dev_put( priv->parent );
       free_netdev( child );
       LOG( "module %s unloaded", THIS_MODULE->name );
       LOG( "=============================================" );
    }

    module_init( init );
    module_exit( virt_exit );

    MODULE_AUTHOR( "Oleg Tsiliuric" );
    MODULE_LICENSE( "GPL v2" );
    MODULE_VERSION( "3.7" );
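One detail of the listing is worth a small user-space illustration: strIP() returns a pointer to a single static buffer, which is why open() has to copy the first result before calling it again, and its shifts assume a little-endian host. The variant below is only a sketch; format_ip() is a hypothetical helper that writes into a caller-supplied buffer and reads the address byte by byte, so it does not depend on host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <arpa/inet.h>   /* inet_addr(), htonl() for the usage example */

/* Format an IPv4 address given in network byte order into buf. */
static const char *format_ip(uint32_t addr_be, char *buf, size_t size)
{
    uint8_t b[4];

    assert(buf != NULL && size >= 16);  /* "255.255.255.255" plus NUL */
    memcpy(b, &addr_be, sizeof(b));     /* bytes in on-the-wire order */
    snprintf(buf, size, "%u.%u.%u.%u",
             (unsigned)b[0], (unsigned)b[1], (unsigned)b[2], (unsigned)b[3]);
    return buf;
}
```

Because each call uses its own buffer, an address and its mask can be formatted in a single expression, for example format_ip(inet_addr("192.168.50.17"), a, sizeof a) next to format_ip(htonl(0xFFFFFF00), m, sizeof m).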


Everything is quite transparent:

Here's how it works:


Expanding opportunities


Now, briefly, on how to make a full-fledged virtual interface that works only with its own traffic and does not disrupt the operation of the parent interface (which is what the full version of the module in the archive does). For this you need:
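One part of this, the decision whether an incoming frame belongs to the virtual interface, can be sketched in user space. This is not the code from the archive: frame_is_for_child() and the SK_* constants are hypothetical names, and the sketch assumes plain Ethernet framing with ARP for IPv4 over Ethernet. It compares the IPv4 destination address or the ARP target address with the address the module stores in child_ip (network byte order).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* ntohs(); inet_addr() in the usage example */

#define SK_ETH_HLEN 14       /* Ethernet header length */
#define SK_P_IP     0x0800   /* same value as ETH_P_IP  */
#define SK_P_ARP    0x0806   /* same value as ETH_P_ARP */

/* Returns 1 if the frame is addressed to child_ip_be, otherwise 0. */
static int frame_is_for_child(const uint8_t *frame, size_t len,
                              uint32_t child_ip_be)
{
    uint16_t proto_be;
    uint32_t target_be;
    size_t off;

    assert(frame != NULL);
    if (len < SK_ETH_HLEN)
        return 0;
    memcpy(&proto_be, frame + 12, sizeof(proto_be));  /* h_proto field */
    switch (ntohs(proto_be)) {
    case SK_P_IP:
        off = SK_ETH_HLEN + 16;   /* destination address in the IPv4 header */
        break;
    case SK_P_ARP:
        off = SK_ETH_HLEN + 24;   /* target protocol address in the ARP body */
        break;
    default:
        return 0;                 /* other protocols stay with the parent */
    }
    if (len < off + sizeof(target_be))
        return 0;
    memcpy(&target_be, frame + off, sizeof(target_be));
    return target_be == child_ip_be;
}
```

In a kernel filter the same comparison would presumably be made on the socket buffer's headers, with only the matching frames redirected to the child interface and everything else left to the parent.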


Using such a full-fledged module, you can open to the host, for example, two parallel SSH sessions on different interfaces (using different IP addresses), which will in fact share a single common physical interface:
    $ ssh olej@192.168.50.17
    olej@192.168.50.17's password:
    Last login: Mon Jul 16 15:52:16 2012 from 192.168.1.9
    ...
    $ ssh olej@192.168.56.101
    olej@192.168.56.101's password:
    Last login: Mon Jul 16 17:29:57 2012 from 192.168.50.1
    ...
    $ who
    olej     tty1         2012-07-16 09:29 (:0)
    olej     pts/0        2012-07-16 09:33 (:0.0)
    ...
    olej     pts/6        2012-07-16 17:29 (192.168.50.1)
    olej     pts/7        2012-07-16 17:31 (192.168.56.1)


The last command shown (who) is executed inside an SSH session, that is, on the remote host to which two independent connections from two different subnets have been established (the last two lines of output). Both connections actually come from one and the same host, seen through its different network interfaces.

Further clarification


While preparing and debugging the module examples, the following (fairly recent) book was actively used for details: Rami Rosen, “Linux Kernel Networking: Implementation and Theory”, Apress, 650 pages, 2014, ISBN-13: 978-1-4302-6196-4.


The author kindly made it available for free download even before the book went on sale (2013-12-22). You can download it on this page.

Anyone interested in questions like those discussed in this article will find in this book many ideas for further developing their own uses of network interfaces.

The archive of code mentioned in the text, for experiments and further development, can be found here or here.

Source: https://habr.com/ru/post/270517/

