
Linux, delayed driver loading and broken interrupts

Today I will talk about unexpected problems that arose when we connected a matrix keyboard to an ARM board running Linux in the Bercut-ETN device (ETN is a new hardware revision of Bercut-ET). Specifically, about why the adp5589 driver refused to receive interrupts and how we got it to do so.

If you are interested, read on.




The hardware around the keyboard:

The SoC has no keyboard controller of its own. The keyboard is connected over the I2C bus through a dedicated matrix keyboard controller, the adp5589 chip. The chip's interrupt line is wired to one of the GPIO pins of the ARM SoC. The resulting wiring diagram looks like this:



portb - the GPIO port, one pin of which receives the interrupt line from the keyboard controller;
intc - the main interrupt controller;
i2c0 - the I2C bus controller.

For some reason, the adp5589 driver stubbornly fails to get its interrupt number. What could cause this behavior? Perhaps some resource the keyboard driver needs is missing at the time it is loaded. Maybe a device it depends on has not been brought up yet? Let's look at which devices it may depend on:

First, on the I2C bus controller to which it is connected.
Second, on the GPIO port controller, to one pin of which the interrupt line is wired.

Now let's see in what order the drivers of these devices are loaded:

gic
designware-i2c
adp5589
dw-apb-gpio-port

Aha! That's the reason: when the keyboard driver is loaded, the driver of its interrupt-parent is not loaded yet. As a result, the keyboard driver does not receive an interrupt number. The standard solution to this problem is the deferred probing mechanism (delayed loading of drivers).

Its essence is that a driver may ask to be probed again later if some resource it needs is not yet available. It does so by returning -EPROBE_DEFER from its probe function. The driver core then retries the probe later. By that time, either the necessary resource is already available, or the probe is deferred again.

Add a check to the keyboard driver's probe function:

    if (!client->irq) {
        dev_err(&client->dev, "no IRQ boss?\n");
        return -EPROBE_DEFER;
    }

Full of hope, we look at the new loading order:
gic
adp5589
designware-i2c
dw-apb-gpio-port
(deferred) adp5589
(deferred) adp5589
(deferred) adp5589

Something went wrong: the keyboard driver was probed again after the GPIO driver, but it still did not get an interrupt. It looks like we will have to dig deeper into the source code than expected.

This suggests three possible solutions:


First option:

This option works, but it is undesirable. It is acceptable as a temporary measure, but if something changes in the hardware (for example, the interrupt output is wired to another GPIO port), changes will be needed not only in the Device Tree but also in the driver source code.

Second option:

It is not possible to set the driver loading order explicitly, so this option does not work.

Third option:

This is the most correct one, and it is the one we will consider.

Before going further, it is worth saying a few words about Device Tree, since it will be referenced below.

Device Tree is one of the ways to describe the hardware of a device on which we want to run Linux. It is a tree of nodes that carry the necessary information. A DT exists as human-readable text files (.dts, .dtsi) and as a binary file (.dtb) compiled from them.

As an example, consider a fragment of the .dts file that describes how our keyboard controller is connected to the other devices of the SoC.

    i2c0: i2c@ffc04000 {
        compatible = "snps,designware-i2c";

        keybs@34 {
            compatible = "adi,adp5589";
            interrupts = <19 IRQ_TYPE_LEVEL_LOW>;
            interrupt-parent = <&portb>;
        };
    };

    intc: intc@fffed000 {
        compatible = "arm,cortex-a9-gic";
        #interrupt-cells = <3>;
        interrupt-controller;
    };

    portb: gpio-controller@0 {
        compatible = "snps,dw-apb-gpio-port";
        interrupt-controller;
        #interrupt-cells = <2>;
        interrupts = <0 165 4>;
        interrupt-parent = <&intc>;
    };

(Nodes and properties that are not relevant here have been removed to make the fragment easier to follow.)

i2c0, keybs, intc and portb are nodes; everything else is their properties. The fragment makes it immediately clear that the keyboard controller chip sits on the I2C bus. The compatible property is a string that describes the manufacturer and model of the device. It is by this property that the OS decides which driver to bind to the device.

interrupt-controller is a property that marks the node as an interrupt controller, and interrupt-parent names the controller that receives the interrupt of the current device.

#interrupt-cells is a property that gives the number of cells (parameters) used to describe one interrupt of this interrupt controller, and interrupts is the property in which those parameters are set for a particular interrupt.

For example, portb contains #interrupt-cells = <2>. This means that every node that has portb as its interrupt-parent must describe each interrupt with two parameters in its interrupts property. portb is the interrupt-parent of keybs, so we look at keybs. It says: interrupts = <19 IRQ_TYPE_LEVEL_LOW>. What does that mean?

Two parameters are given here. The first is the number of the portb pin to which the interrupt line from the keyboard controller is wired. The second is the trigger type of the interrupt (here, active-low level). How do we find out how many parameters an interrupt controller needs and what each of them means? Usually this is written in the binding documentation. For portb, it is this file: Documentation/devicetree/bindings/gpio/snps-dwapb-gpio.txt.
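To make the cell counting concrete, here is a minimal user-space sketch (not kernel code) that decodes the two interrupt specifiers from the fragment above. The struct and function names are hypothetical and exist only for this illustration. The IRQ_TYPE_* values are the ones defined in include/dt-bindings/interrupt-controller/irq.h, and the three-cell layout (type, number, flags, with shared peripheral interrupts starting at hardware interrupt 32) follows the ARM GIC binding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Trigger flags, same values as include/dt-bindings/interrupt-controller/irq.h */
#define IRQ_TYPE_EDGE_RISING  1
#define IRQ_TYPE_EDGE_FALLING 2
#define IRQ_TYPE_LEVEL_HIGH   4
#define IRQ_TYPE_LEVEL_LOW    8

/* Hypothetical decoded form of one entry of an "interrupts" property. */
struct irq_spec {
    uint32_t hwirq;   /* line number inside the parent controller */
    uint32_t trigger; /* one of the IRQ_TYPE_* values */
};

/* Two cells (portb style): <pin trigger>. Returns 0, or -1 on a wrong cell count. */
int decode_gpio_spec(const uint32_t *cells, size_t ncells, struct irq_spec *out)
{
    assert(cells != NULL && out != NULL);
    if (ncells != 2)
        return -1;
    out->hwirq = cells[0];
    out->trigger = cells[1] & 0xf;
    return 0;
}

/* Three cells (GIC style): <type number flags>, where type 0 is SPI and 1 is PPI. */
int decode_gic_spec(const uint32_t *cells, size_t ncells, struct irq_spec *out)
{
    assert(cells != NULL && out != NULL);
    if (ncells != 3 || cells[0] > 1)
        return -1;
    /* SPIs start at hardware interrupt 32, PPIs at 16. */
    out->hwirq = cells[1] + (cells[0] == 0 ? 32 : 16);
    out->trigger = cells[2] & 0xf;
    return 0;
}
```

With the values from the fragment, the keybs entry <19 IRQ_TYPE_LEVEL_LOW> decodes to pin 19 of portb with trigger value 8, and the portb entry <0 165 4> decodes to GIC hardware interrupt 197 with an active-high level trigger.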

&portb is a reference to the portb node (in our case, it resolves to /soc/gpio@ff709000/gpio-controller@0).
The remaining properties are not needed here. They, and Device Tree in general, are described in detail at devicetree.org/Device_Tree_Usage.

It is also worth mentioning how devices and drivers are registered (do not worry, we return to the main topic right after this). According to the Linux Device Model:

A device is a physical or virtual object that is attached to a bus (which may itself be virtual).
A driver is a software object that can be bound to a device and perform control functions for it.
A bus is a device that serves as the "attachment point" for other devices. The basic functionality of every bus supported by the kernel is defined by the bus_type structure. This structure refers to a subsys_private structure, which holds two lists: klist_devices and klist_drivers.
klist_devices is the list of devices attached to the bus.
klist_drivers is the list of drivers that can manage devices on this bus.
Devices and drivers are added to these lists by the device_register and driver_register functions. In addition, both functions try to bind devices to drivers: device_register walks the list of drivers looking for one that suits the new device, and driver_register walks the list of devices looking for devices the new driver can control. Whether a driver suits a device is decided by the match(dev, drv) function, a pointer to which is stored in the bus_type structure.
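The two-way matching described above can be sketched in user-space C as follows. This is a simplified model, not the kernel implementation: the toy_* types and functions are hypothetical, the lists are plain arrays instead of klists, matching is reduced to comparing compatible strings, and no probe function is called after a match.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ITEMS 8

struct toy_driver { const char *compatible; };

struct toy_device {
    const char *compatible;
    const struct toy_driver *driver; /* NULL until a driver is bound */
};

struct toy_bus {
    struct toy_device *devices[MAX_ITEMS]; /* plays the role of klist_devices */
    size_t ndevices;
    struct toy_driver *drivers[MAX_ITEMS]; /* plays the role of klist_drivers */
    size_t ndrivers;
    int (*match)(const struct toy_device *dev, const struct toy_driver *drv);
};

static int match_compatible(const struct toy_device *dev,
                            const struct toy_driver *drv)
{
    return strcmp(dev->compatible, drv->compatible) == 0;
}

void toy_bus_init(struct toy_bus *bus)
{
    memset(bus, 0, sizeof(*bus));
    bus->match = match_compatible;
}

/* Add a device, then walk the driver list looking for a suitable driver. */
void toy_device_register(struct toy_bus *bus, struct toy_device *dev)
{
    assert(bus->ndevices < MAX_ITEMS);
    bus->devices[bus->ndevices++] = dev;
    for (size_t i = 0; i < bus->ndrivers && !dev->driver; i++)
        if (bus->match(dev, bus->drivers[i]))
            dev->driver = bus->drivers[i];
}

/* Add a driver, then walk the device list looking for unbound devices it suits. */
void toy_driver_register(struct toy_bus *bus, struct toy_driver *drv)
{
    assert(bus->ndrivers < MAX_ITEMS);
    bus->drivers[bus->ndrivers++] = drv;
    for (size_t i = 0; i < bus->ndevices; i++)
        if (!bus->devices[i]->driver && bus->match(bus->devices[i], drv))
            bus->devices[i]->driver = drv;
}
```

The point of the sketch is that the order of registration does not matter for binding: a device registered before its driver stays unbound until the matching driver arrives, and a driver registered first is picked up as soon as the device appears.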



Now we can move on to the main topic: how deferred probing is implemented. Let's look at the file drivers/base/dd.c. Here is a brief description of what we see there:

Two lists are used to manage driver re-probing: deferred_probe_pending_list and deferred_probe_active_list.

deferred_probe_pending_list is the list of devices whose driver is missing some resource.
deferred_probe_active_list is the list of devices whose driver can be probed again.

The really_probe function calls the probe function of the bus on which the device sits: dev->bus->probe(dev). In our case, this is i2c_device_probe. The return value is checked for errors, and if it is -EPROBE_DEFER, the device is added to deferred_probe_pending_list.

The most interesting part is how and when the driver is called again. As long as drivers return -EPROBE_DEFER, their devices are appended to deferred_probe_pending_list one after another. But as soon as the probe function of any driver completes successfully, all devices from deferred_probe_pending_list are moved to deferred_probe_active_list. This is logical: the driver that has just loaded successfully may be exactly the one the deferred drivers were waiting for. The retry for devices in deferred_probe_active_list is performed by the deferred_probe_work_func function, which calls bus_probe_device for each device in the list.

The call to bus_probe_device eventually brings us back to really_probe for our device and its driver (see above).
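The interplay of the two lists can be modelled with a short user-space sketch. It is a simplified model under stated assumptions, not the code from drivers/base/dd.c: the sim_* names, the fixed-size arrays, and the two example probe functions are hypothetical, and the retry runs synchronously instead of from a workqueue. Only the numeric value of EPROBE_DEFER (517) is taken from the kernel headers.

```c
#include <assert.h>
#include <stddef.h>

#define EPROBE_DEFER 517 /* same value as in the kernel's include/linux/errno.h */
#define MAX_DEVS 8

struct sim_dev {
    const char *name;
    int (*probe)(struct sim_dev *dev);
    int bound;    /* set to 1 once probe has succeeded */
    int attempts; /* how many times probe has been called */
};

struct dev_list {
    struct sim_dev *items[MAX_DEVS];
    size_t count;
};

static struct dev_list pending; /* models deferred_probe_pending_list */
static struct dev_list active;  /* models deferred_probe_active_list */

static void list_add(struct dev_list *list, struct sim_dev *dev)
{
    assert(list->count < MAX_DEVS);
    list->items[list->count++] = dev;
}

/* After any successful probe, every pending device becomes eligible for a retry. */
static void trigger_deferred(void)
{
    for (size_t i = 0; i < pending.count; i++)
        list_add(&active, pending.items[i]);
    pending.count = 0;
}

/* Models really_probe(): call probe, park the device if it asks for a deferral. */
int sim_really_probe(struct sim_dev *dev)
{
    int ret;

    dev->attempts++;
    ret = dev->probe(dev);
    if (ret == -EPROBE_DEFER) {
        list_add(&pending, dev);
    } else if (ret == 0) {
        dev->bound = 1;
        trigger_deferred();
    }
    return ret;
}

/* Models deferred_probe_work_func(): retry everything on the active list. */
void sim_deferred_work(void)
{
    while (active.count > 0)
        sim_really_probe(active.items[--active.count]);
}

/* Example probes: the keyboard needs the GPIO controller to be up first. */
static int gpio_ready;

int gpio_probe(struct sim_dev *dev)
{
    (void)dev;
    gpio_ready = 1;
    return 0;
}

int kbd_probe(struct sim_dev *dev)
{
    (void)dev;
    return gpio_ready ? 0 : -EPROBE_DEFER;
}
```

If the keyboard device is probed first, it lands on the pending list. The successful GPIO probe then moves it to the active list, and the next call to sim_deferred_work() binds it on the second attempt.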



But wait! So far we have been talking about the probe function of the bus on which the device sits, that is, i2c_device_probe. What about the probe function of the keyboard driver itself? We have not forgotten it: it is simply called from i2c_device_probe. This can be seen in its code in the file drivers/i2c/i2c-core.c:

Code of i2c_device_probe

    static int i2c_device_probe(struct device *dev)
    {
        struct i2c_client *client = i2c_verify_client(dev);
        struct i2c_driver *driver;
        int status;

        if (!client)
            return 0;

        driver = to_i2c_driver(dev->driver);
        if (!driver->probe || !driver->id_table)
            return -ENODEV;

        if (!device_can_wakeup(&client->dev))
            device_init_wakeup(&client->dev,
                               client->flags & I2C_CLIENT_WAKE);
        dev_dbg(dev, "probe\n");

        status = of_clk_set_defaults(dev->of_node, false);
        if (status < 0)
            return status;

        status = dev_pm_domain_attach(&client->dev, true);
        if (status != -EPROBE_DEFER) {
            /* this is where the probe function of the device driver is called */
            status = driver->probe(client, i2c_match_id(driver->id_table, client));
            if (status)
                dev_pm_domain_detach(&client->dev, true);
        }

        return status;
    }


So re-probing seems to work. Why, then, does the keyboard driver not get its interrupt number?
Let's trace how the interrupt number is supposed to reach our driver.

The adp5589_probe(struct i2c_client *client, const struct i2c_device_id *id) function receives a client structure, one of whose fields, irq, is the number of the interrupt that our device (the keyboard controller) will generate. adp5589_probe is called from i2c_device_probe(struct device *dev). That function receives a device structure, and the pointer to the enclosing i2c_client structure is computed from the pointer to it (with the magic of the container_of macro).

A few words about container_of
This macro takes a pointer to a structure field, the type of the structure, and the name of the field that the pointer points to, and returns a pointer to the structure itself.
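A minimal user-space sketch of the idea follows. The macro here is a simplified version built on offsetof (the kernel's own definition adds type checking), the two structures are toy stand-ins for the kernel's struct device and struct i2c_client with only a few fields, and client_from_dev is a hypothetical helper that does what the kernel's to_i2c_client() does.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified container_of: step back from the member to the start of the struct. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Toy stand-ins for the kernel structures, reduced to a few fields. */
struct device {
    int id;
};

struct i2c_client {
    unsigned short addr; /* chip address on the I2C bus */
    int irq;             /* interrupt number of the device */
    struct device dev;   /* embedded device structure */
};

/* Recover the enclosing i2c_client from a pointer to its embedded dev field. */
struct i2c_client *client_from_dev(struct device *dev)
{
    assert(dev != NULL);
    return container_of(dev, struct i2c_client, dev);
}
```

Given a pointer to the dev member of some i2c_client, client_from_dev() returns a pointer to that same i2c_client, which is how i2c_device_probe gets from its struct device argument to the client and its irq field.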



How it works is explained well here.

So we need to find where the i2c_client structure is filled in. This happens in the function i2c_new_device(struct i2c_adapter *adap, struct i2c_board_info const *info). Specifically, the irq field is copied from the field of the same name in the i2c_board_info structure.

    struct i2c_client *client;
    client->irq = info->irq;

The i2c_board_info structure, in turn, is filled in by the of_i2c_register_devices(struct i2c_adapter *adap) function.

 info.irq = irq_of_parse_and_map(node, 0); 

irq_of_parse_and_map is a wrapper around a chain of two functions, of_irq_parse_one and irq_create_of_mapping. of_irq_parse_one tries to find the node that the device tree declares as the interrupt-parent of the current device.
Remember these lines in the device tree?

    expander: pca9535@20 {
        interrupt-parent = <&portb>;
    };

It is portb that of_irq_parse_one looks for. Based on what it finds, it fills in an of_phandle_args structure, which is passed to irq_create_of_mapping. irq_create_of_mapping, in turn, returns the interrupt number we need.

On the first attempt, the interrupt domain of the GPIO port cannot be found, because the GPIO driver has not been probed yet. The kernel complains about this in the log:

    irq: no irq domain found for /soc/gpio@ff709000/gpio-controller@0 !

And what happens when the driver is probed again? Nothing. Only i2c_device_probe and adp5589_probe are called.
That is the problem: the interrupt number is determined only once, when the device is registered, and stays that way forever, no matter how many times the driver is re-probed.

We have found the problem, but how do we fix it?

We could try moving the code that obtains the interrupt number into i2c_device_probe. The interrupt number is not needed anywhere before that point, so this should not cause problems.

But first, let's look at the source code of a more recent kernel version (we are running 3.18). Here is what we see there:
The setup of the I2C client interrupt has been moved into the i2c_device_probe function.

    if (!client->irq && dev->of_node) {
        int irq = of_irq_get(dev->of_node, 0);

        if (irq == -EPROBE_DEFER)
            return irq;
        if (irq < 0)
            irq = 0;

        client->irq = irq;
    }

Although the irq field remains in the i2c_board_info structure, it is no longer filled in from the Device Tree when the device is registered. So the problem has been fixed in newer kernel versions.
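The difference between the two schemes can be shown with a small user-space sketch. Everything in it is a hypothetical model: the toy_* names, the gpio_domain_ready flag, and the interrupt number 305 are made up for illustration, and toy_lookup_irq() returns 0 for "no mapping yet", whereas the kernel's of_irq_get() reports that case as -EPROBE_DEFER itself.

```c
#include <assert.h>
#include <stddef.h>

#define EPROBE_DEFER 517 /* same value as in the kernel's include/linux/errno.h */

/* Becomes 1 once the GPIO controller driver has set up its interrupt domain. */
static int gpio_domain_ready;

/* Stand-in for the interrupt lookup: 0 means that no mapping exists yet. */
static int toy_lookup_irq(void)
{
    return gpio_domain_ready ? 305 : 0;
}

struct toy_client {
    int irq;
};

/* Old scheme (3.18): the interrupt is resolved once, at registration time. */
void register_client_old(struct toy_client *c)
{
    assert(c != NULL);
    c->irq = toy_lookup_irq();
}

int probe_old(struct toy_client *c)
{
    assert(c != NULL);
    if (!c->irq)
        return -EPROBE_DEFER; /* retrying cannot help: nobody updates c->irq */
    return 0;
}

/* New scheme: registration leaves irq unset and every probe attempt resolves it. */
void register_client_new(struct toy_client *c)
{
    assert(c != NULL);
    c->irq = 0;
}

int probe_new(struct toy_client *c)
{
    assert(c != NULL);
    if (!c->irq) {
        int irq = toy_lookup_irq();

        if (!irq)
            return -EPROBE_DEFER;
        c->irq = irq;
    }
    return 0;
}

/* Called when the GPIO controller driver finishes probing. */
void gpio_driver_probed(void)
{
    gpio_domain_ready = 1;
}
```

If both clients are registered before gpio_driver_probed() is called, probe_old() keeps returning -EPROBE_DEFER on every retry, while probe_new() succeeds on the first retry after the GPIO driver is up. This is the behavior observed above with the repeated "(deferred) adp5589" lines.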

All that remains is to port the change to our version. All changes are confined to the file drivers/i2c/i2c-core.c.
We add the I2C client interrupt setup from the newer kernel to our i2c_device_probe and remove the interrupt setup from the of_i2c_register_devices function.

The changes as a git diff listing:

    --- a/drivers/i2c/i2c-core.c
    +++ b/drivers/i2c/i2c-core.c
    @@ -626,6 +626,17 @@ static int i2c_device_probe(struct device *dev)
             if (!client)
                     return 0;

    +        if (!client->irq && dev->of_node) {
    +                int irq = of_irq_get(dev->of_node, 0);
    +
    +                if (irq == -EPROBE_DEFER)
    +                        return irq;
    +                if (irq < 0)
    +                        irq = 0;
    +
    +                client->irq = irq;
    +        }
    +
             driver = to_i2c_driver(dev->driver);
             if (!driver->probe || !driver->id_table)
                     return -ENODEV;
    @@ -1407,7 +1418,12 @@ static void of_i2c_register_devices(struct i2c_adapter *adap)
                             continue;
                     }

    -                info.irq = irq_of_parse_and_map(node, 0);
    +                /*
    +                 * Now, we don't need to set interrupt here, because we set
    +                 * it in i2c_device_probe function
    +                 * info.irq = irq_of_parse_and_map(node, 0);
    +                 */
    +
                     info.of_node = of_node_get(node);
                     info.archdata = &dev_ad;


Let's check: the keyboard works. Look at /proc/interrupts:

    $ grep 'adp5589_keys' /proc/interrupts
    305: 2 - 20 adp5589_keys

Press a few keys:

    $ grep 'adp5589_keys' /proc/interrupts
    305: 6 - 20 adp5589_keys

Problem solved.

Source: https://habr.com/ru/post/271983/

