2012-08-21 05:35:02

by Ming Lei

[permalink] [raw]
Subject: udev 182: response timeout for request_firmware in module_probe path

Hi Kay,

I found that udev 182 doesn't response for the request_firmware in
module_probe path until 30sec later after the 'ADD' event of firmware
device, but no such problem in udev175, sounds like a regression of udev?

I find there was a related discussion in [1], so CC guys who discussed before.

Just grep under kernel root dir, there are about 360 request_firmware callers,
and looks most of them are called in .probe path.


[1], https://lkml.org/lkml/2012/2/19/57

Thanks,
--
Ming Lei


2012-08-23 15:16:50

by Ming Lei

[permalink] [raw]
Subject: Re: udev 182: response timeout for request_firmware in module_probe path

Cc systemd-devel

Hi,

On Tue, Aug 21, 2012 at 1:34 PM, Ming Lei <[email protected]> wrote:
> Hi Kay,
>
> I found that udev 182 doesn't response for the request_firmware in
> module_probe path until 30sec later after the 'ADD' event of firmware
> device, but no such problem in udev175, sounds like a regression of udev?

Looks udevd is capable of handling the firmware ADD event even though
the device ADD event is being processed( modprobe is triggered by device
ADD event).

The below patch[1] can fix the problem, could you give any comments on it?

>
> I find there was a related discussion in [1], so CC guys who discussed before.
>
> Just grep under kernel root dir, there are about 360 request_firmware callers,
> and looks most of them are called in .probe path.
>
>
> [1], https://lkml.org/lkml/2012/2/19/57


[1],

diff --git a/src/udev/udevd.c b/src/udev/udevd.c
index b4fc624..806721c 100644
--- a/src/udev/udevd.c
+++ b/src/udev/udevd.c
@@ -96,6 +96,7 @@ struct event {
const char *devpath_old;
dev_t devnum;
bool is_block;
+ bool is_firmware;
int ifindex;
};

@@ -439,6 +440,7 @@ static int event_queue_insert(struct udev_device *dev)
event->devpath_old = udev_device_get_devpath_old(dev);
event->devnum = udev_device_get_devnum(dev);
event->is_block = (strcmp("block",
udev_device_get_subsystem(dev)) == 0);
+ event->is_firmware = (strcmp("firmware",
udev_device_get_subsystem(dev)) == 0);
event->ifindex = udev_device_get_ifindex(dev);

udev_queue_export_device_queued(udev_queue_export, dev);
@@ -520,7 +522,7 @@ static bool is_devpath_busy(struct event *event)
}

/* parent device event found */
- if (event->devpath[common] == '/') {
+ if (event->devpath[common] == '/' && !event->is_firmware) {
event->delaying_seqnum = loop_event->seqnum;
return true;
}



Thanks,
--
Ming Lei

2012-08-23 15:31:33

by Kay Sievers

[permalink] [raw]
Subject: Re: udev 182: response timeout for request_firmware in module_probe path

On Thu, Aug 23, 2012 at 5:16 PM, Ming Lei <[email protected]> wrote:
> On Tue, Aug 21, 2012 at 1:34 PM, Ming Lei <[email protected]> wrote:

>> I found that udev 182 doesn't response for the request_firmware in
>> module_probe path until 30sec later after the 'ADD' event of firmware
>> device, but no such problem in udev175, sounds like a regression of udev?
>
> Looks udevd is capable of handling the firmware ADD event even though
> the device ADD event is being processed( modprobe is triggered by device
> ADD event).

Calling out from inside the kernel and blocking in a firmware loading
userspace transaction during module_init() is kind of weird.

The firmware loading call should not rely on a fully functional
userspace, and modprobe should not hang and block until the firmware
request is handled.

The firmware should be requested asynchronously or from a different
context as module_init(). It depends on the type of driver/hardware
what's the best approach here.

Thanks,
Kay

2012-08-23 16:04:38

by Ming Lei

[permalink] [raw]
Subject: Re: udev 182: response timeout for request_firmware in module_probe path

On Thu, Aug 23, 2012 at 11:31 PM, Kay Sievers <[email protected]> wrote:
> On Thu, Aug 23, 2012 at 5:16 PM, Ming Lei <[email protected]> wrote:
>> On Tue, Aug 21, 2012 at 1:34 PM, Ming Lei <[email protected]> wrote:
>
>>> I found that udev 182 doesn't response for the request_firmware in
>>> module_probe path until 30sec later after the 'ADD' event of firmware
>>> device, but no such problem in udev175, sounds like a regression of udev?
>>
>> Looks udevd is capable of handling the firmware ADD event even though
>> the device ADD event is being processed( modprobe is triggered by device
>> ADD event).
>
> Calling out from inside the kernel and blocking in a firmware loading
> userspace transaction during module_init() is kind of weird.

Strictly speaking, the request_firmware is not called by module_init()
directly, but called by driver .probe() which is triggered inside module_init().

I admit that it is weird if the driver is built in kernel. But considered that
most of drivers are built as module, it should be workable since the
userspace is already ready during module probing.

Also from the viewpoint of device driver, loading driver inside .probe()
is reasonable since the device may be powered on just now and
can't work further without the firmware.

Finally, the reality is that hundreds of drivers call request_firmware()
inside their.probe(), and converting them into its asynchronous pair
will introduce much much works, :-(

>
> The firmware loading call should not rely on a fully functional
> userspace, and modprobe should not hang and block until the firmware
> request is handled.

IMO, the driver probing path is allowed to sleep, so looks request firmware
should be allowed inside .probe().

In the patch I posted, it will make the situation workable at least.

>From udevd code, the firmware ADD event isn't run until its parent
device has been handled. IMO, looks like there is no obvious side
effect to loose the constraint for firmware device.

Correct me if it is wrong, and I am not familiar with udev.

>
> The firmware should be requested asynchronously or from a different
> context as module_init(). It depends on the type of driver/hardware
> what's the best approach here.

Requesting firmware asynchronously should be better, but may
introduce some complexity to drivers, for example, race between
request/release firmware context and hotplug contexts.


Thanks,
--
Ming Lei

2012-08-23 16:41:48

by Alan

[permalink] [raw]
Subject: Re: udev 182: response timeout for request_firmware in module_probe path

> IMO, the driver probing path is allowed to sleep, so looks request firmware
> should be allowed inside .probe().

I'm not convinced about that. It can sleep but its holding various locks
in most cases, and it looks like that can end up in a complete heap.

By all means *request* the firmware asynchronously in the probe, but
there needs to be a seperate method somewhere after the probe to finish
the job once the firmware appears.

Alan

2012-08-23 17:03:38

by Ming Lei

[permalink] [raw]
Subject: Re: udev 182: response timeout for request_firmware in module_probe path

On Fri, Aug 24, 2012 at 12:46 AM, Alan Cox <[email protected]> wrote:
>> IMO, the driver probing path is allowed to sleep, so looks request firmware
>> should be allowed inside .probe().
>
> I'm not convinced about that. It can sleep but its holding various locks
> in most cases, and it looks like that can end up in a complete heap.

Currently, only wait_for_completion() pends in the path of request_firmware
without holding other locks.

>
> By all means *request* the firmware asynchronously in the probe, but
> there needs to be a seperate method somewhere after the probe to finish
> the job once the firmware appears.

Yes, request asynchronously will work and introduce a bit complexity
(but it is not a big deal)

Looks the focus is that why request_firmware can't be used in driver
.probe().

IMO, we should consider that most of drivers call request_firmware
inside its .probe(), and maybe udev need to support the usage.


Thanks,
--
Ming Lei

2012-08-25 19:00:23

by Linus Torvalds

[permalink] [raw]
Subject: Re: udev 182: response timeout for request_firmware in module_probe path

On Thu, Aug 23, 2012 at 8:31 AM, Kay Sievers <[email protected]> wrote:
>
> Calling out from inside the kernel and blocking in a firmware loading
> userspace transaction during module_init() is kind of weird.

It's *very* common.

I personally would prefer if drivers did their firmware loading not at
probe time, but at device open time, but that's not always necessarily
possible. More importantly, it's not actually how things are done.

So udev had better be fixed. The whole "no regressions" should be the
rule not just for the kernel, but it damn well should be the rule at
*least* also for projects like udev that are so closely related to the
kernel.

The "I wish things were otherwise than they are" mindset is wishful
thinking. Reality is that probing - and firmware loading - happens
from the module init routines quite often, and it clearly used to
work. So udev broke. Fix it, don't argue that you wish things were
otherwise.

Linus

2012-08-26 16:04:50

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: udev 182: response timeout for request_firmware in module_probe path

On Thu, 2012-08-23 at 17:46 +0100, Alan Cox wrote:
> > IMO, the driver probing path is allowed to sleep, so looks request
> firmware
> > should be allowed inside .probe().
>
> I'm not convinced about that. It can sleep but its holding various
> locks
> in most cases, and it looks like that can end up in a complete heap.
>
> By all means *request* the firmware asynchronously in the probe, but
> there needs to be a seperate method somewhere after the probe to
> finish
> the job once the firmware appears.

Not necessarily enough in the general case, for example some stacks will
cause a driver open to be call back from somewhere within the
register_foo() the driver did to register itself with it's subsystem
inside probe. For example register_framebuffer(), register_netdev(), ...

This is clearly a udev bug :-)

Cheers,
Ben.