This is a translation of the article "Working with C unions in Rust FFI" by Herman J. Radtke III.
Note: This article assumes that the reader is familiar with Rust FFI, byte order (endianness), and ioctl.
When creating bindings to C code, we will inevitably encounter a structure that contains a union. At the time of writing, Rust has no built-in support for unions, so we have to work out a strategy on our own. In C, a union is a type that stores values of different types in the same memory area. There are many reasons to prefer a union, such as converting between the binary representations of integers and floating-point numbers, implementing pseudo-polymorphism, and accessing bits directly. I will focus on pseudo-polymorphism.
As an example, let's get a MAC address from an interface name. The steps needed to obtain it are:
- Specify the type of request to be used with ioctl. To get a MAC (or hardware) address, I specify SIOCGIFHWADDR.
- Write the interface name (something like eth0) into ifr_name.
- Make the request using ioctl. If the request succeeds, the result is written to ifr_ifru.
If you are interested in the details of obtaining a MAC address, look at these instructions.
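The second step in the list above can be sketched in isolation. This is only an illustration: write_ifr_name is a hypothetical helper that does not appear in the original article, and the buffer length of 16 is the ifr_name size reported later in this article. The ioctl call itself needs an FFI binding and is not shown here.

```rust
use std::os::raw::c_char;

/// Length of `ifr_name` in the C `ifreq` structure (16 bytes, per this article).
const IFNAMSIZ: usize = 16;

/// Hypothetical helper: copy an interface name such as "eth0" into a
/// fixed-size, NUL-terminated buffer laid out like `ifr_name`.
/// Returns `None` if the name does not fit (one byte is reserved for the
/// trailing NUL) or if it contains a NUL byte.
fn write_ifr_name(name: &str) -> Option<[c_char; IFNAMSIZ]> {
    let bytes = name.as_bytes();
    if bytes.len() >= IFNAMSIZ || bytes.contains(&0) {
        return None;
    }
    // Start from an all-zero buffer so the name is always NUL-terminated.
    let mut buf = [0 as c_char; IFNAMSIZ];
    for (dst, src) in buf.iter_mut().zip(bytes) {
        *dst = *src as c_char;
    }
    Some(buf)
}
```

Rejecting over-long names instead of truncating them is a design choice in this sketch; silently truncating could end up querying a different interface.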
We need to call the C function ioctl and pass it an ifreq structure. Looking in /usr/include/net/if.h, we see that ifreq is defined as follows:
struct ifreq {
    char ifr_name[IFNAMSIZ];
    union {
        struct sockaddr ifru_addr;
        struct sockaddr ifru_dstaddr;
        struct sockaddr ifru_broadaddr;
        short ifru_flags;
        int ifru_metric;
        int ifru_mtu;
        int ifru_phys;
        int ifru_media;
        int ifru_intval;
        caddr_t ifru_data;
        struct ifdevmtu ifru_devmtu;
        struct ifkpi ifru_kpi;
        u_int32_t ifru_wake_flags;
        u_int32_t ifru_route_refcnt;
        int ifru_cap[2];
    } ifr_ifru;
};
The difficulties arise with the ifr_ifru union. Looking at its possible types, we see that they are not all the same size: short is two bytes, and u_int32_t is four. Several structures of unknown size complicate things further. To write correct Rust code, it is important to find out the exact size of the ifreq structure. I wrote a small C program and found that ifreq uses 16 bytes for ifr_name and 24 bytes for ifr_ifru.
Armed with the correct size of the structure, we can begin to represent it in Rust. One strategy is to create a specialized structure for each type in the union.
#[repr(C)]
pub struct IfReqShort {
    ifr_name: [c_char; 16],
    ifru_flags: c_short,
}
We can use IfReqShort to make a SIOCGIFFLAGS request, which fills in the short flags field. However, this structure is smaller than the ifreq structure in C. Although we expect only 2 bytes to be written, the ioctl interface expects the union to occupy 24 bytes. To be safe, let's add 22 bytes of padding at the end:
#[repr(C)]
pub struct IfReqShort {
    ifr_name: [c_char; 16],
    ifru_flags: c_short,
    _padding: [u8; 22],
}
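Because it is easy to get these sizes wrong, it can help to let the compiler check them. The sketch below is an illustration rather than part of the original article: IfReqShortUnpadded is a hypothetical name for the first, unpadded attempt, the fields are made public only to keep the example warning-free, and the expected total of 40 bytes relies on the 16 + 24 byte measurement reported above. The `const _: () = assert!(...)` form needs a reasonably recent compiler (Rust 1.57 or later).

```rust
use std::mem::size_of;
use std::os::raw::{c_char, c_short};

/// The first attempt: the name plus the `short`, with no padding.
#[repr(C)]
pub struct IfReqShortUnpadded {
    pub ifr_name: [c_char; 16],
    pub ifru_flags: c_short,
}

/// The padded version: 22 extra bytes fill the rest of the 24-byte union.
#[repr(C)]
pub struct IfReqShort {
    pub ifr_name: [c_char; 16],
    pub ifru_flags: c_short,
    pub _padding: [u8; 22],
}

// Compile-time checks: the build fails if a size is not the expected one.
// 16 (name) + 2 (short) = 18 bytes; 16 (name) + 24 (union) = 40 bytes.
const _: () = assert!(size_of::<IfReqShortUnpadded>() == 18);
const _: () = assert!(size_of::<IfReqShort>() == 40);
```

A check like this would have to be repeated for every specialized structure.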
We would then have to repeat this process for each type in the union. I find this somewhat tedious, since we have to create a lot of structures and be very careful not to get their sizes wrong. Another way to represent a union is to use a buffer of raw bytes. We can make a single representation of the ifreq structure in Rust as follows:
#[repr(C)]
pub struct IfReq {
    ifr_name: [c_char; 16],
    union: [u8; 24],
}
This union buffer can hold the bytes of any of the union's types. Now we can define methods that convert the raw bytes to the desired type. We avoid unsafe code by not using transmute. Let's create a method that gets the MAC address by converting the raw bytes to the C sockaddr type.
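impl IfReq {
    pub fn ifr_hwaddr(&self) -> sockaddr {
        let mut s = sockaddr {
            sa_family: u16::from_be((self.union[0] as u16) << 8 | (self.union[1] as u16)),
            sa_data: [0; 14],
        };
        // Copy the remaining 14 bytes into sa_data.
        for (i, b) in self.union[2..16].iter().enumerate() {
            s.sa_data[i] = *b as c_char;
        }
        s
    }
}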
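To see the conversion work end to end, here is a self-contained sketch. Several things in it are assumptions rather than the author's code: sockaddr is a local stand-in with a 16-bit family followed by 14 data bytes, format_mac is a hypothetical helper, and the hardware address is assumed to occupy the first six bytes of sa_data. The family is read with u16::from_ne_bytes, which should give the same result as the shift-and-from_be expression used in this article.

```rust
use std::os::raw::c_char;

/// Local stand-in for the C `sockaddr` (assumed layout: a 16-bit family
/// followed by 14 bytes of address data).
#[allow(non_camel_case_types)]
#[repr(C)]
pub struct sockaddr {
    pub sa_family: u16,
    pub sa_data: [c_char; 14],
}

#[repr(C)]
pub struct IfReq {
    pub ifr_name: [c_char; 16],
    pub union: [u8; 24],
}

impl IfReq {
    /// Interpret the first 16 bytes of the raw buffer as a `sockaddr`.
    pub fn ifr_hwaddr(&self) -> sockaddr {
        let mut s = sockaddr {
            // Native-endian read of the first two bytes.
            sa_family: u16::from_ne_bytes([self.union[0], self.union[1]]),
            sa_data: [0; 14],
        };
        // Copy the next 14 bytes into sa_data.
        for (dst, src) in s.sa_data.iter_mut().zip(&self.union[2..16]) {
            *dst = *src as c_char;
        }
        s
    }
}

/// Hypothetical helper: format the first six bytes of `sa_data`
/// as a colon-separated MAC address.
pub fn format_mac(addr: &sockaddr) -> String {
    addr.sa_data[..6]
        .iter()
        .map(|b| format!("{:02x}", *b as u8))
        .collect::<Vec<_>>()
        .join(":")
}
```

In real use the buffer would be filled by ioctl; to try the sketch, the bytes can be written by hand and then passed through ifr_hwaddr and format_mac.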
This approach leaves us with one structure and one method for converting raw bytes to the desired type. Looking again at the ifr_ifru union, we find at least two other requests that also need to build a sockaddr from raw bytes. Applying the DRY principle, we could implement a private method on IfReq that converts raw bytes to sockaddr. However, we can do better by moving the details of creating a sockaddr, short, int, and so on out of IfReq. All we need is to tell the union which type we want. Let's create an IfReqUnion for this:
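#[repr(C)]
struct IfReqUnion {
    data: [u8; 24],
}

impl IfReqUnion {
    fn as_sockaddr(&self) -> sockaddr {
        let mut s = sockaddr {
            sa_family: u16::from_be((self.data[0] as u16) << 8 | (self.data[1] as u16)),
            sa_data: [0; 14],
        };
        // Copy the remaining 14 bytes into sa_data.
        for (i, b) in self.data[2..16].iter().enumerate() {
            s.sa_data[i] = *b as c_char;
        }
        s
    }
}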
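The conversions for the integer members of the union follow the same pattern. Below is a minimal sketch of as_short and as_int, the two method names used by the rest of this article. The bodies are an assumption, not the author's code: they read the leading bytes in native byte order with from_ne_bytes, and they assume a 16-bit c_short and a 32-bit c_int, which holds on common platforms.

```rust
use std::os::raw::{c_int, c_short};

#[repr(C)]
pub struct IfReqUnion {
    pub data: [u8; 24],
}

impl IfReqUnion {
    /// Read the first two bytes as a native-endian C `short`.
    pub fn as_short(&self) -> c_short {
        c_short::from_ne_bytes([self.data[0], self.data[1]])
    }

    /// Read the first four bytes as a native-endian C `int`.
    pub fn as_int(&self) -> c_int {
        c_int::from_ne_bytes([self.data[0], self.data[1], self.data[2], self.data[3]])
    }
}
```

Every member of a C union starts at offset zero, which is why both methods read from the beginning of the buffer.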
We implement a method like this for each of the types that make up the union; the code below also relies on as_int and as_short. Now that the conversions are handled by IfReqUnion, we can implement the IfReq methods as follows:
const IFNAMESIZE: usize = 16; // length of ifr_name (IFNAMSIZ in C)

#[repr(C)]
pub struct IfReq {
    ifr_name: [c_char; IFNAMESIZE],
    union: IfReqUnion,
}

impl IfReq {
    pub fn ifr_hwaddr(&self) -> sockaddr {
        self.union.as_sockaddr()
    }

    pub fn ifr_dstaddr(&self) -> sockaddr {
        self.union.as_sockaddr()
    }

    pub fn ifr_broadaddr(&self) -> sockaddr {
        self.union.as_sockaddr()
    }

    pub fn ifr_ifindex(&self) -> c_int {
        self.union.as_int()
    }

    pub fn ifr_media(&self) -> c_int {
        self.union.as_int()
    }

    pub fn ifr_flags(&self) -> c_short {
        self.union.as_short()
    }
}
As a result, we have two structures. First, IfReq, which represents the memory layout of ifreq in C. On it, we implement a method for each type of ioctl request. Second, IfReqUnion, which manages the different types of the ifr_ifru union. On it, we create a method for each type we need. This is less work than creating a specialized structure for each type in the union, and it provides a better interface than doing the type conversion in IfReq itself.
A more complete example is available. There is still a little work to do, but the tests pass, and it implements the approach described above.
Be careful: this approach is not perfect. In the case of ifreq, we are lucky that ifr_name is 16 bytes long, so the union that follows it starts on a word boundary. If ifr_name did not end on a four-byte word boundary, we would have a problem. Our union has the type [u8; 24], which is aligned to a one-byte boundary, whereas the real 24-byte C union has a stricter alignment. Here is a brief example illustrating the problem. Suppose we have a C structure containing the following union:
struct foo {
    short x;
    union {
        int i;
    } y;
};
This structure is 8 bytes in size: two bytes for x, two bytes of padding for alignment, and four bytes for y. Let's try to represent it in Rust:
#[repr(C)]
pub struct Foo {
    x: u16,
    y: [u8; 4],
}
The Foo structure is only 6 bytes in size: two bytes for x, followed immediately by the four u8 elements of y, the first two of which share a four-byte word with x. This subtle difference can cause problems when the structure is passed to a C function that expects an 8-byte structure.
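The mismatch can be observed directly with size_of and align_of. In this sketch, FooAsC and layouts are hypothetical names introduced for illustration: FooAsC stands in for the layout a C compiler would produce by using an i32 field where the union holds an int, and the sketch assumes a platform where a 32-bit integer is four-byte aligned.

```rust
use std::mem::{align_of, size_of};

/// Byte-buffer representation: `y` only needs one-byte alignment.
#[repr(C)]
pub struct Foo {
    pub x: u16,
    pub y: [u8; 4],
}

/// Stand-in for the C layout: the union holds an `int`, so `y` must start
/// on a four-byte boundary and two padding bytes are inserted after `x`.
#[repr(C)]
pub struct FooAsC {
    pub x: u16,
    pub y: i32,
}

/// Returns the (size, alignment) pair of each layout, byte buffer first.
pub fn layouts() -> [(usize, usize); 2] {
    [
        (size_of::<Foo>(), align_of::<Foo>()),
        (size_of::<FooAsC>(), align_of::<FooAsC>()),
    ]
}
```

Comparing the two pairs shows both the size difference and the alignment difference that causes it.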
Until Rust supports unions, it will be difficult to solve such problems correctly. Good luck, but be careful!