2011-12-31 00:22:28

by Linus Torvalds

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Fri, Dec 30, 2011 at 3:54 PM, Dave Jones <[email protected]> wrote:
> We're getting a bunch of reports against Fedora 16
> (still using 3.1) which look like some drivers are trying to
> load firmware on resume from suspend, while usermodehelper
> is disabled.

Ok, buggy drivers. You *must*not* load firmware in your resume path,
since there is no actual guarantee that any particular device will be
resumed after the disk that contains the firmware images.

So it's very simple: drivers that load firmware at resume time are
buggy. No ifs, buts, or maybes about it.

> Here are some example traces:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=746411

It's isight_firmware_load(), in the isight_firmware driver. The driver
doesn't actually do anything but load the firmware, and is apparently
not very good at that either.

It should either fake a disconnect and reconnect of the device (and
let the reconnect then load the firmware through udev or something) or
it should just save the firmware image in memory from the original
load, and make the resume just re-initialize it - not load it.

It's also possible that it could be considered a USB layer bug, and
the USB layer should just not rebind the devices directly in the
resume function, but do it somehow later. HOWEVER, that would only
work for "random" USB devices that aren't in use by user space (like
disks etc might be). So I think that in general the real solution is
always just "make sure that the firmware is in memory before the
suspend even happens".

Greg - has the USB resume logic been changed lately?

Matthew? Any comments about that particular driver?

> https://bugzilla.redhat.com/show_bug.cgi?id=771002

Same issue, different driver. Again, it's USB, and it's possible that
USB just makes it really hard to do this correctly (ie the "save
firmware image across suspend so that you don't have to load it at
resume time").

It's also possible that we should blame the firmware code, which is
expressly written to encourage these kinds of bugs. It may be that i
tshould be the firmware code that has a "get_firmware()" +
"put_firmware()" model, and it should cache the firmware explicitly if
the config supports suspend, so that a firmware read at resume time
would actually work. The whole "request_firmware()" interface really
is very prone to these kinds of bugs.

But it's possible this could be fixed at the driver level by doing the
caching there.

In this case it's the rtl8192cu driver, so Larry, Chaoming, John etc
added to the cc for that one.

> This possibly sounds like the problem that caca9510ff4e5d842c0589110243d60927836222
> was trying to fix, but that patch is present in the kernels
> being reported.

No, caca9510ff4e is only for the case where you actually compile the
firmware *into* the kernel, so it's part of the kernel image. That's
useful mainly for avoiding modules and initrd images, thus allowing
things like having root directly on a disk that needs firmware to be
loaded. Quite unusual, and it doesn't really work all that well.

Oh, and some people use it for the Radeon firmware with the radeon DRM
code built it.

It really does need to be fixed at a driver level (possibly with the
help of firmware/usb support infrastructure).

Linus


2011-12-31 18:39:24

by Marek Vasut

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

> On Fri, Dec 30, 2011 at 3:54 PM, Dave Jones <[email protected]> wrote:
> > We're getting a bunch of reports against Fedora 16
> > (still using 3.1) which look like some drivers are trying to
> > load firmware on resume from suspend, while usermodehelper
> > is disabled.
>
> Ok, buggy drivers. You *must*not* load firmware in your resume path,
> since there is no actual guarantee that any particular device will be
> resumed after the disk that contains the firmware images.
>
> So it's very simple: drivers that load firmware at resume time are
> buggy. No ifs, buts, or maybes about it.
>
> > Here are some example traces:
> >
> > https://bugzilla.redhat.com/show_bug.cgi?id=746411
>
> It's isight_firmware_load(), in the isight_firmware driver. The driver
> doesn't actually do anything but load the firmware, and is apparently
> not very good at that either.
>
> It should either fake a disconnect and reconnect of the device (and
> let the reconnect then load the firmware through udev or something) or
> it should just save the firmware image in memory from the original
> load, and make the resume just re-initialize it - not load it.

Hey,

maybe we should implement such thing into the firmware loader itself? Allow it
-- for example via some node in /dev -- to force loading firmware into some
buffer in kernel just before suspend so it'll certainly be readily available at
resume time?

M

>
> It's also possible that it could be considered a USB layer bug, and
> the USB layer should just not rebind the devices directly in the
> resume function, but do it somehow later. HOWEVER, that would only
> work for "random" USB devices that aren't in use by user space (like
> disks etc might be). So I think that in general the real solution is
> always just "make sure that the firmware is in memory before the
> suspend even happens".
>
> Greg - has the USB resume logic been changed lately?
>
> Matthew? Any comments about that particular driver?
>
> > https://bugzilla.redhat.com/show_bug.cgi?id=771002
>
> Same issue, different driver. Again, it's USB, and it's possible that
> USB just makes it really hard to do this correctly (ie the "save
> firmware image across suspend so that you don't have to load it at
> resume time").
>
> It's also possible that we should blame the firmware code, which is
> expressly written to encourage these kinds of bugs. It may be that i
> tshould be the firmware code that has a "get_firmware()" +
> "put_firmware()" model, and it should cache the firmware explicitly if
> the config supports suspend, so that a firmware read at resume time
> would actually work. The whole "request_firmware()" interface really
> is very prone to these kinds of bugs.
>
> But it's possible this could be fixed at the driver level by doing the
> caching there.
>
> In this case it's the rtl8192cu driver, so Larry, Chaoming, John etc
> added to the cc for that one.
>
> > This possibly sounds like the problem that
> > caca9510ff4e5d842c0589110243d60927836222 was trying to fix, but that
> > patch is present in the kernels
> > being reported.
>
> No, caca9510ff4e is only for the case where you actually compile the
> firmware *into* the kernel, so it's part of the kernel image. That's
> useful mainly for avoiding modules and initrd images, thus allowing
> things like having root directly on a disk that needs firmware to be
> loaded. Quite unusual, and it doesn't really work all that well.
>
> Oh, and some people use it for the Radeon firmware with the radeon DRM
> code built it.
>
> It really does need to be fixed at a driver level (possibly with the
> help of firmware/usb support infrastructure).
>
> Linus
> --
> To unsubscribe from this list: send the line "unsubscribe linux-usb" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2011-12-31 15:33:13

by Alan Stern

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Sat, 31 Dec 2011, Matthew Garrett wrote:

> On Fri, Dec 30, 2011 at 04:22:06PM -0800, Linus Torvalds wrote:
>
> > It's isight_firmware_load(), in the isight_firmware driver. The driver
> > doesn't actually do anything but load the firmware, and is apparently
> > not very good at that either.
> >
> > It should either fake a disconnect and reconnect of the device (and
> > let the reconnect then load the firmware through udev or something) or
> > it should just save the firmware image in memory from the original
> > load, and make the resume just re-initialize it - not load it.
>
> Mm. My recollection is that these devices retained their firmware over
> suspend/resume, so wouldn't resume with a USB id that matched the driver
> and so this code shouldn't be called. It seems that either I was
> horribly wrong about that, or something's changed in the USB layer
> that's resulting in them resetting themselves. Newer devices don't
> require this, so I'll need to try to chase up some older hardware to
> figure out what's going on.

Nothing has changed in the USB layer -- it has always been true that
devices could be reset during a suspend/resume cycle. This isn't a
matter of how the stack is written or anything like that; some
motherboards simply do not provide suspend power to their USB
controllers. Or the firmware reinitializes the controllers and
attached devices during resume, forcing Linux's USB core to reset
every device on the affected buses.

When it comes to suspend/resume, there are almost no guarantees. :-(

Alan Stern


2011-12-31 16:09:54

by Oliver Neukum

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

Am Samstag, 31. Dezember 2011, 16:33:12 schrieb Alan Stern:

> Nothing has changed in the USB layer -- it has always been true that
> devices could be reset during a suspend/resume cycle. This isn't a
> matter of how the stack is written or anything like that; some
> motherboards simply do not provide suspend power to their USB
> controllers. Or the firmware reinitializes the controllers and
> attached devices during resume, forcing Linux's USB core to reset
> every device on the affected buses.
>
> When it comes to suspend/resume, there are almost no guarantees. :-(

We are definitely going through do_unbind_rebind(). But I don't think
it matters why we got there. We seem to be calling probe() too early.
And we need to guarantee that a driver can request firmware in probe()

So something has changed in the resume code path further up.

Regards
Oliver


2011-12-31 00:40:26

by Matthew Garrett

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Fri, Dec 30, 2011 at 04:22:06PM -0800, Linus Torvalds wrote:

> It's isight_firmware_load(), in the isight_firmware driver. The driver
> doesn't actually do anything but load the firmware, and is apparently
> not very good at that either.
>
> It should either fake a disconnect and reconnect of the device (and
> let the reconnect then load the firmware through udev or something) or
> it should just save the firmware image in memory from the original
> load, and make the resume just re-initialize it - not load it.

Mm. My recollection is that these devices retained their firmware over
suspend/resume, so wouldn't resume with a USB id that matched the driver
and so this code shouldn't be called. It seems that either I was
horribly wrong about that, or something's changed in the USB layer
that's resulting in them resetting themselves. Newer devices don't
require this, so I'll need to try to chase up some older hardware to
figure out what's going on.

--
Matthew Garrett | [email protected]

2011-12-31 16:27:32

by Matthew Garrett

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Sat, Dec 31, 2011 at 10:33:12AM -0500, Alan Stern wrote:

> Nothing has changed in the USB layer -- it has always been true that
> devices could be reset during a suspend/resume cycle. This isn't a
> matter of how the stack is written or anything like that; some
> motherboards simply do not provide suspend power to their USB
> controllers. Or the firmware reinitializes the controllers and
> attached devices during resume, forcing Linux's USB core to reset
> every device on the affected buses.

I know that this is hardware dependent, but the hardware hasn't changed
here.

--
Matthew Garrett | [email protected]

2012-01-01 09:54:50

by Marek Vasut

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

> Am Samstag, 31. Dezember 2011, 19:39:18 schrieb Marek Vasut:
> > maybe we should implement such thing into the firmware loader itself?
> > Allow it -- for example via some node in /dev -- to force loading
> > firmware into some buffer in kernel just before suspend so it'll
> > certainly be readily available at resume time?
>
> This is difficult. We don't know at suspend time which firmware we will
> need at resume time.

That's not actually true ... you can ask the drivers if they need to load
firmware after resume ... that can be implemented and udev can handle it. The
driver should know if the hardware it's controlling is stupid or not.

> We could keep a record of firmware we needed
> to get to the state we are, but this is a very complex scheme.

Not really. You might have a device where you load firmware, issue suspend
command and then issue resume command, while the device will still retain
firmware in it, it'll be only in low power mode throughout the sleep cycle.

Then you can have a device that will just power down completely, wiping the
firmware from it's memory. And that's what should be taken care of by injecting
the firmware before suspend. The driver should be able to tell you if the device
is broken and needs this special treatment.

This will though delay the time to suspect slightly, but that can be considered
a price you have to pay if your hardware is stupid.

M
>
> Regards
> Oliver

2012-01-03 12:19:20

by Jack Stone

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On 03/01/12 11:57, Oliver Neukum wrote:
> Am Dienstag, 3. Januar 2012, 01:42:20 schrieb Alan Cox:
>> In that case however you don't want some generic firmware module knowing
>> all this crap, your driver can just request_firmware() the stuff as
>> modprobe and free it up on the module unload. For a typical 8bit firmware
>> of a few K you'll free a ton more memory unloading the module than the
>> firmware ! That I think actually covers the majority of devices under
>> discussion.
>
> I am afraid it doesn't, at least not fully.
> We have many devices whose primary (operational) driver does not load
> the firmware. That job is left to a secondary driver or user space.
>
> We could leave those secondary drivers and their firmware in RAM after
> their primary usage and except for a few pathological cases (which
> can be solved with a few udev rules preventively loading drivers) we'd
> be good, but we lack a mechanism for switching to a seconary driver
> and back during resumption.

You don't mension the bus used so I'm going to use USB as an example.

As I understand it the majority of devices that need firmware have two
different identities: the "bootstrap" id and the "primary" id. If we
always make sure we have the drivers for both bootsrap and primary ids
loaded then we should have the firmware in memory. If the device has
lost its firmware then it will reregister with the bootstrap id
otherwise it will continue with the primary id. Either way the correct
driver should pick it up and handle it.

That said, I guess there are more complicated / difficult pieces of
hardware that might not do that. Is that the case here? Can we detect
that the firmware needs reloading and have the primary driver "yield" to
the firmware driver?

Thanks,

Jack


2012-01-02 20:49:10

by Linus Torvalds

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Mon, Jan 2, 2012 at 12:41 PM, Jack Stone <[email protected]> wrote:
>
> What about the case where the firmware that needs to be loaded is on the
> USB device that needs the firmware. This can be resolved at boot using
> an initrd but at resume time we don't have that.

Quite frankly, caching the firmware will just automagically fix this.

I don't understand why people even bother to talk about USB ID's etc
changing - that is totally irrelevant. We don't look up firmware by
USB ID's anyway.

So the *only* thing you need is:

- don't have the firmware loaded by some special user space helper -
use "load_firmware()" (which in turn then will use a user-space helper
to load it, but that's a separate issue)

- make sure "load_firmware()" caches the firmware by firmware name.

You can even make the caching be some simple "cache for at least 10
seconds after the last 'put' operation", and you'll be able to
trivially handle the case of a device "going away" only to be
immediately reconnected due to some electrical issue or whatever
(including changing USB ID's).

Even if the ID's change, the driver should damn well react to the new
ID and load the same damn firmware, because it's the same damn
physical device.

Why are you guys making it any more complicated than that?

Sure, if you have a user-space driver and you load the firmware
magically from user space too, and never use "request_firmware()" in a
real kernel driver, then you can never handle suspend/resume
correctly, BUT WHO THE F*CK CARES? At that point it's a "the kernel
cannot do it, because the kernel has nothing to do with it", but
exactly for the same reason it is TOTALLY POINTLESS to even bring it
up as an issue. It's not an issue - it's a broken user-space driver
that has nothing to do with the kernel, and shouldn't be discussed on
kernel lists as a kernel issue.

So the only thing we need is to fix the braindamaged firmware
interfaces. We need caching, and we need to replace (or at least
extend) "load_firware()" with "[get|put]_firmware()".

Linus

2012-01-03 00:37:13

by Matthew Garrett

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Tue, Jan 03, 2012 at 12:20:48AM +0000, Alan Cox wrote:
> > No, the uvcvideo driver would need to call out to load the firmware
> > loader driver, otherwise you'll never know to load the firmware loader
> > driver.
>
> Does that not simply depend which driver has the relevant USB
> identifiers ?

If something presents as a class device then it's going to be picked up
by the class driver.

--
Matthew Garrett | [email protected]

2012-01-03 00:14:09

by Matthew Garrett

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Mon, Jan 02, 2012 at 11:31:55PM +0000, Jack Stone wrote:

> Rather than merging the drivers we could make the uvcvideo (or any other
> generic driver) driver expose a "subdriver" interface that allows
> another module to bind to the specific ID and add extra handling to
> certain events. We could then use that interface to provide a firmware
> driver for each device that needs it. The firmware driver could bind to
> the bootstrap device and to the uvcvideo subdriver interface and it
> would then have all the lifetime info we need. [Credit to Linus for
> suggesting this idea]

No, the uvcvideo driver would need to call out to load the firmware
loader driver, otherwise you'll never know to load the firmware loader
driver.

--
Matthew Garrett | [email protected]

2012-01-03 09:33:46

by David Lang

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Tue, 3 Jan 2012, Alexander E. Patrakov wrote:

> Matthew Garrett <[email protected]> wrote:
>
>> On Mon, Jan 02, 2012 at 01:27:03PM -0800, Linus Torvalds wrote:
>>
>>> If we didn't load the firmware before the suspend, then the resume
>>> function of a device sure as hell had better not load it at resume
>>> time either.
>>
>> If the hardware has lost its state then refusing to load the firmware
>> at resume time isn't going to leave you with a working device.
>>
>>> And for chrissake, don't bother making it more complicated than it
>>> is, just for some theoretical hardware or situation that nobody
>>> cares about.
>>
>> It's not theoretical hardware. This appears to be the current
>> behaviour of the isight devices. If you reboot they retain their
>> firmware. If you suspend, they don't. So if we have a flow like this:
>>
>> 1) user boots from cold. Device comes up with generic USB ID.
>> 2) isight_firmware loads and binds. Firmware is loaded. Device
>> disconnects and reconnects with an ID that's bound by the UVC driver.
>> 3) user reboots. Device comes up with UVC ID. isight_firmware doesn't
>> bind.
>> 4) user suspends.
>> 5) user resumes. isight_firmware binds and attempts to load firmware.
>>
>> then just caching the firmware is inadequate - we had never
>> previously seen the device on this boot, so we've never loaded it in
>> order to cache it. isight_firmware could unconditionally load the
>> firmware on module load just in case a device is plugged in, but that
>> seems even less elegant than caching it.
>
> What a heated discussion due to, essentially, a non-technical, legal
> issue! Remember that the whole "userspace firmware loader" saga
> together with the asynchronous firmware interface started when Debian
> started complaining over the non-freeness of the firmware being bundled
> as a part of the kernel module as an array of bytes. That design,
> however, never had such dependency issues. So maybe revert to it, with
> the following changes, and solve the legal issue seen by Debian by
> hiring a lawyer?

at the very least, there should be a simple compile option that says
'compile any firmware that may be needed into the kernel/module image' so
that those of us who are not worried about the distribution of firmware
blobs (or who don't believe that just splitting the firmware blobs into
a separate tree gains anything) can just opt out of this entire mess. As I
see it, today we have the option of manually specifying firmware blobs to
compile in, but no easy way to just include them all.

> 1. Make firmware a special case of a data-only non-GPL kernel module.
> Change the tainter so that it doesn't taint the kernel for data-only
> modules.
> 2. Make the actual driver depend on the relevant firmware modules for
> all devices supported by it, even if the devices don't always need the
> said firmware.
> 3. Disallow building drivers that need firmware as non-modules.

Please do not do this one.

David Lang

> 4. Do something (e.g. split the driver into a core and a shim, or make
> a fake firmware module) to allow the user to install without firmware
> if he knows that it works with his hardware.
> 5. Profit!
>
> This way, as long as the driver is loaded, the necessary firmware is
> also there, as a dependency.
>
>

2012-01-03 14:53:08

by Bernd Petrovitsch

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Die, 2012-01-03 at 15:16 +0600, Alexander E. Patrakov wrote:
[...]
> however, never had such dependency issues. So maybe revert to it, with
> the following changes, and solve the legal issue seen by Debian by
> hiring a lawyer?

You have to talk to the Debian people about this.

Sorry for the semi-off-topic,
Bernd
--
Bernd Petrovitsch Email : [email protected]
LUGA : http://www.luga.at


2012-01-02 21:09:11

by Jack Stone

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On 02/01/12 21:00, Linus Torvalds wrote:
> On Mon, Jan 2, 2012 at 12:55 PM, Jack Stone <[email protected]> wrote:
>>
>> The problem comes with knowing when to put the firmware - how do we tell
>> the generic driver that that device has firmware that might need to be
>> freed on disconnect. I don't know enough about USB to know if we can
>> tell if the same device has reconnected due to firmware being loaded or
>> if it was simply unplugged and a new device plugged in.
>
> Well, if we just have a 10-second timeout for the firmware flushing
> after the last "put" operation, we don't even care.
>
> It doesn't even *matter* if the device is a physically different
> device - if the driver loads the same firmware, it will be cached, and
> it will still be correct for that (physically different, but with the
> same firmware requirements) device. So even the whole "user switched
> devices around" issue is irrelevant.
>
> Of course, if the driver is some piece-of-crap that explicitly loads
> the wrong firmware for the device it manages, the firmware will be
> wrong, but hey, "Don't do that then". The drivers job very much
> includes being able to look at the device ID (whether USB or PCIe or
> PCMCIA) and figure out which firmware image it needs to load.

What about USB "class" drivers e.g. usb-storage. They handle any device
that reports itself as a usb mass storage device. There could be a
device that needs to be bootstrapped before it becomes a generic usb
mass storage device. Do we really want to have to write a new driver
that is almost identical to the generic driver but handles the USB
firmware correctly.

The other option is to extend the class drivers to understand that a
certain device needs this piece of firmware and to handle it but that
will bloat the generic drivers with lots of vendor stuff. We could do
this by lists of USB ids or something similar.

Thanks,

Jack


2012-01-01 21:39:31

by Michael Büsch

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Sun, 1 Jan 2012 22:27:56 +0100
Oliver Neukum <[email protected]> wrote:

> Ideally usbcore would deal with such devices, but at present the USB layer
> is unable to cleanly resume devices that change their IDs during
> resume.
> Solving this would mean putting all firmware loaders into kernel space.
> And the mode switching logic as well.

I don't get it. Why would a device enter a state after resume, that it was not
in at _any_ time before the machine was suspended?

These change-id-on-bootstrap devices usually work like this,
as far as I know:

probe bootstrap device (switches hw to real device)
probe real device (firmware is loaded)
Suspend machine
Resume machine
usb detects that the device is "gone"
probe/resume bootstrap device (switches hw to real device)
probe/resume real device (No need to fetch fw from userspace. It's already cached)

What did I get wrong?

--
Greetings, Michael.

2012-01-02 23:53:25

by Alan

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Mon, 02 Jan 2012 20:41:00 +0000
Jack Stone <[email protected]> wrote:

> On 02/01/12 16:54, Alan Stern wrote:
> > What Oliver and I have been discussing is delaying the probe call until
> > after resume is finished, when userspace is back up and everything is
> > operating normally again. It would be more awkward than the current
> > code, but it definitely can be done.
>
> What about the case where the firmware that needs to be loaded is on the
> USB device that needs the firmware. This can be resolved at boot using
> an initrd but at resume time we don't have that.

There is a very simple answer to such conflicts - don't do it. If you
have firmware you need to reload to get stuff back which might form a
dependancy chain then cache it for the lifetime of the device being in
use. You will never sanely compute that dependancy chain IMHO (although
by all means prove me wrong!) given it might involve things like firmware
over raid over iscsi over nfs over a usb wireless link...

We have the same theoretical problem btw with kernel modules. if they are
on a usb device you need the driver loaded and kept loaded while the
driver is in use - we just happen to do that right anyway so it works
fine.

Alan

2012-01-02 08:33:39

by Oliver Neukum

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

Am Sonntag, 1. Januar 2012, 22:39:13 schrieb Michael B?sch:

> These change-id-on-bootstrap devices usually work like this,
> as far as I know:
>
> probe bootstrap device (switches hw to real device)

And you will find no driver if the firmware is uploaded in user space.
udev runs a firmware loader that uses libusb. usb_modeswitch
presents you with the same problem.

> probe real device (firmware is loaded)

The real driver will often be generic and have no idea the device
needs firmware to turn it into a functional generic device. And we really don't
want to contaminate the generic drivers with that information.

So we'd need

a) kernel infrastructure to track "shadow" drivers for firmware upload
b) a heuristic to decide when a device underwent a virtual replug

> Suspend machine
> Resume machine
> usb detects that the device is "gone"
> probe/resume bootstrap device (switches hw to real device)

We need a way to verify this is really the device which we need firmware
for.

> probe/resume real device (No need to fetch fw from userspace. It's already cached)

Now we need to wait for the firmware upload to take effect. How long?
And we need to hold off telling the driver that its device was disconnected.

> What did I get wrong?

In addition you now need logic to tell the "shadow" drivers that a device
is gone so that they can free the firmware. Oh and of course you will need
to refcount firmware (but we probably need to do that anyway)

Regards
Oliver

2012-01-02 23:25:57

by Jack Stone

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On 02/01/12 22:31, Marek Vasut wrote:
>> I don't think there is anyway to avoid the memory requirement if we want
>> to be able to resume transparently to user-space (or even resume at all
>> in some setups).
>
> Well ... injecting firmware into kernel with some userland helper just before
> suspend is no-go?

Its a perfectly good idea but you still need the full memory requirement
during the suspend.

Thanks,

Jack


2012-01-03 00:50:18

by Alan

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

> Well, if we just have a 10-second timeout for the firmware flushing
> after the last "put" operation, we don't even care.

So for a resume 10 seconds after the resume point on the basis that's
probably enough time to enumerate all the existing hardware without
errors ? (With errors it can be about 30 as my crappy Dell monitor USB
hub now and then likes to remind me)


Alan

2012-01-01 16:17:53

by Alan Stern

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Sun, 1 Jan 2012, Oliver Neukum wrote:

> Am Sonntag, 1. Januar 2012, 03:21:45 schrieb Alan Stern:
> > On Sat, 31 Dec 2011, Oliver Neukum wrote:
>
> > > We are definitely going through do_unbind_rebind(). But I don't think
> > > it matters why we got there. We seem to be calling probe() too early.
> > > And we need to guarantee that a driver can request firmware in probe()
> > >
> > > So something has changed in the resume code path further up.
> >
> > For at least a year and a half, it has been true that rebinding takes
>
> Yes, so I don't understand why we haven't been hitting this issue
> all that time.

Perhaps if Matthew compared logs from an old and a new system with
appropriate debugging options configured, something might show up.

> > place during the complete phase of system resume. Clearly that is too
> > early to load firmware. When do you think we should do it instead?
>
> I think we ought to delay until right after user space is unfrozen.
> The devices are gone anyway, so we lose nothing but have the ability
> to call helpers.

That's true. But doing it will be a nuisance. The nice thing about
the current situation is that the driver core automatically iterates
over all interfaces.

> > And how should we keep track of which interfaces need rebinding?
>
> I guess this calls for a brute force approach. We can kick khubd and have
> it walk all busses checking for needs_binding.

Ugh. At a minimum, let's use a different process, not khubd.

As Linus pointed out, the real problem here is the firmware loader.
The way it is now, a driver can't always depend on the data being
available, even during a normal boot. It ought to use an asynchronous
approach; then none of these problems would arise.

Alan Stern


2012-01-03 00:44:24

by Jack Stone

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.



Marek Vasut <[email protected]> wrote:

>> On 02/01/12 22:31, Marek Vasut wrote:
>> >> I don't think there is anyway to avoid the memory requirement if
>we want
>> >> to be able to resume transparently to user-space (or even resume
>at all
>> >> in some setups).
>> >
>> > Well ... injecting firmware into kernel with some userland helper
>just
>> > before suspend is no-go?
>>
>> Its a perfectly good idea but you still need the full memory
>requirement
>> during the suspend.
>
>Hm ... and we can't have memory type that "can be swapped-out, but must
>be
>loaded back before suspend" in kernel, right?

Nope. Kernel swapping is a big headache and Linux doesn't do it. The only thing we can do is drop any clean cache pages, but that has performance implications.

Thanks,

Jack

--
Sent from my Android phone with K-9 Mail. Please excuse my brevity.

2012-01-03 00:23:14

by Jack Stone

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.



Matthew Garrett <[email protected]> wrote:

>On Mon, Jan 02, 2012 at 11:31:55PM +0000, Jack Stone wrote:
>
>> Rather than merging the drivers we could make the uvcvideo (or any
>other
>> generic driver) driver expose a "subdriver" interface that allows
>> another module to bind to the specific ID and add extra handling to
>> certain events. We could then use that interface to provide a
>firmware
>> driver for each device that needs it. The firmware driver could bind
>to
>> the bootstrap device and to the uvcvideo subdriver interface and it
>> would then have all the lifetime info we need. [Credit to Linus for
>> suggesting this idea]
>
>No, the uvcvideo driver would need to call out to load the firmware
>loader driver, otherwise you'll never know to load the firmware loader
>driver.

So we have a table of USB ids that list the devices that need firmware drivers. When the uvcvideo driver registers the firmware interface with a USB id that matches the list we load the firmware driver and pass it a "warm start" event so it can cache the firmware.

As long as we can identify the devices that *might* need firmware then we are ok.

Thanks,

Jack

--
Sent from my Android phone with K-9 Mail. Please excuse my brevity.

2012-01-03 02:45:59

by Alan Stern

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Mon, 2 Jan 2012, Matthew Garrett wrote:

> It's not theoretical hardware. This appears to be the current behaviour
> of the isight devices. If you reboot they retain their firmware. If you
> suspend, they don't. So if we have a flow like this:
>
> 1) user boots from cold. Device comes up with generic USB ID.
> 2) isight_firmware loads and binds. Firmware is loaded. Device
> disconnects and reconnects with an ID that's bound by the UVC driver.
> 3) user reboots. Device comes up with UVC ID. isight_firmware doesn't
> bind.
> 4) user suspends.
> 5) user resumes. isight_firmware binds and attempts to load firmware.

Wait a second. Why does isight_firmware bind at this time? Binding to
new devices is handled by khubd, which doesn't start running again
until after the resume is finished (the device appears to be new
because its descriptors have changed). At that point there should be
no trouble reloading the firmware.

> then just caching the firmware is inadequate - we had never previously
> seen the device on this boot, so we've never loaded it in order to cache
> it. isight_firmware could unconditionally load the firmware on module
> load just in case a device is plugged in, but that seems even less
> elegant than caching it.
>
> Now this is obviously somewhat mitigated because isight_firmware won't
> have been autoloaded at (3), so won't be there at (5) unless the user's
> manually loaded it. But insmodding a driver shouldn't result in stuff
> breaking later.

I don't see how this matters very much. Certainly not enough to
justify all the effort put into this email thread already.

Regardless of what isight_firmware does, the original UVC device
instance is gone. The new device instance won't appear until the khubd
thread starts running again, after the resume is finished. As far as
userspace is concerned, that's all that matters.

Even in situations where isight_firmware does somehow manage to bind
during resume, the only question is how it can avoid hanging or
delaying the resume procedure. Doing an async firmware load will solve
that.

Alan Stern


2012-01-01 12:22:25

by Oliver Neukum

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

Am Sonntag, 1. Januar 2012, 03:21:45 schrieb Alan Stern:
> On Sat, 31 Dec 2011, Oliver Neukum wrote:

> > We are definitely going through do_unbind_rebind(). But I don't think
> > it matters why we got there. We seem to be calling probe() too early.
> > And we need to guarantee that a driver can request firmware in probe()
> >
> > So something has changed in the resume code path further up.
>
> For at least a year and a half, it has been true that rebinding takes

Yes, so I don't understand why we haven't been hitting this issue
all that time.

> place during the complete phase of system resume. Clearly that is too
> early to load firmware. When do you think we should do it instead?

I think we ought to delay until right after user space is unfrozen.
The devices are gone anyway, so we lose nothing but have the ability
to call helpers.

> And how should we keep track of which interfaces need rebinding?

I guess this calls for a brute force approach. We can kick khubd and have
it walk all busses checking for needs_binding.

> P.S.: Oliver, it looks like there's a bunch of dead code in
> usb_suspend_interface() and usb_resume_interface(). I don't see how
> either one can be executed with a driver that doesn't have both suspend
> and resume methods. Such cases should be filtered out when
> usb_suspend() calls do_unbind_rebind(). The only odd thing that can
> happen is when a driver doesn't support reset-resume. Do you agree?

I agree.

Regards
Oliver

2012-01-03 12:24:38

by Matthew Garrett

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Mon, Jan 02, 2012 at 09:53:46PM -0800, Linus Torvalds wrote:
> On Mon, Jan 2, 2012 at 7:25 PM, Matthew Garrett <[email protected]> wrote:
> > On Mon, Jan 02, 2012 at 09:45:58PM -0500, Alan Stern wrote:
> >
> >> Wait a second. ?Why does isight_firmware bind at this time? ?Binding to
> >> new devices is handled by khubd, which doesn't start running again
> >> until after the resume is finished (the device appears to be new
> >> because its descriptors have changed). ?At that point there should be
> >> no trouble reloading the firmware.
> >
> > Then why are we getting this warning? The firmware is only loaded in the
> > probe function.
>
> The USB suspend/resume function does that "unbind/rebind" dance, and
> that causes a "device_attach()". Which causes a probe() to be called.
>
> Should it do that? I think not, not if the ID's haven't changed. But I
> don't know the USB layer all that well.

The IDs will have changed due to the firmware falling out.

--
Matthew Garrett | [email protected]

2012-01-01 16:26:21

by Michael Büsch

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Sun, 1 Jan 2012 11:17:52 -0500 (EST)
Alan Stern <[email protected]> wrote:

> As Linus pointed out, the real problem here is the firmware loader.
> The way it is now, a driver can't always depend on the data being
> available, even during a normal boot. It ought to use an asynchronous
> approach; then none of these problems would arise.

request_firmware_nowait() exists.

--
Greetings, Michael.

2012-01-02 04:17:23

by Marek Vasut

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

> On Sun, Jan 1, 2012 at 8:17 AM, Alan Stern <[email protected]> wrote:
> > As Linus pointed out, the real problem here is the firmware loader.
> > The way it is now, a driver can't always depend on the data being
> > available, even during a normal boot. It ought to use an asynchronous
> > approach; then none of these problems would arise.
>
> WRONG.
>
> Alan, you're not getting it. Loading firmware as part of
> suspend/resume is WRONG.
>
> It would not be made any better by an asynchronous approach, and that
> isn't the problem with request_firmware().
>
> Suspend/resume is *special*, and it's special for a very simple
> reason: unlike bootup or attaching a new device, suspend/resume
> happens WHILE THE USER IS ACTIVE.
>
> Loading firmware at that time is wrong. It's impossible. You have to
> have the firmware available *before* any processes might need it, but
> at the same time actually loading the firmware may need help from user
> space and/or other devices. It's a chicken-and-egg problem.
>
> So let me repeat one more time: Loading firmware at resume time is a
> device driver bug. Seriously. How many times do I have to say it?
>
> And making it asynchronous doesn't make it *any* less of a bug. It
> doesn't change anything but timing, but the problem was never about
> timing in the first place! The problem was about dependencies. User
> space may well depend on the device, and *other* devices may well
> depend on the device working.
>
> So stop with the inanity. I've already told people what the fix is:
> make sure that the firmware image is in memory before the suspend ever
> happens, and just reload it from memory. NOT with
> "request_firmware()". Because requesting the firmware at resume time
> is buggy and wrong, and has nothing to do with "asynchronous" or
> "synchronous". It has everything to do with "it's buggy".
>
> Really really really.
>
> So the problem with request_firmware() has absolutely nothing to do
> with "asynchronous". The problem is that the firmware interfaces do
> not cache the data in memory, and currently *cannot* sanely cache the
> firmware data simply because the interface doesn't have any kind of
> lifetime rules.
>
> In other words: we could make "request_firmware()" cache *all*
> firmware images forever, but there is currently no way to say "ok, the
> device was unplugged - and not fakily so by a resume event, but for
> real, and physically - so now you can drop the cache". Which means
> that effectively request_firmware() can do no caching at all, because
> it might eventually just run out of memory.
>
> It is *possible* that we might tie the firmware lifetime rules to the
> driver module lifetime. But it would probably be much better if we
> made for an explicit model of "ok, the device is now really gone" so
> that it would work properly with compiled-in drivers etc too.
>
> And yes, the firmware would have to stay around even around a
> resume/suspend that causes unplug events over USB. The "use USB and
> lose power" actually happens also for built-in devices that may well
> be disks and network cards that are *needed* by user space, so even if
> the device has been electrically unplugged, it is still attached and
> needs to be brought back *before* user space comes back.
>
> That's the whole point of suspend/resume: we're not starting from a
> clean slate. We are supposed to continue as if nothing happened!
>
> Linus

FYI this is not only problem with USB, but with PCMCIA too for example.

M

2012-01-02 23:59:49

by Alan

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Mon, 2 Jan 2012 13:27:03 -0800
Linus Torvalds <[email protected]> wrote:

> On Mon, Jan 2, 2012 at 1:19 PM, Matthew Garrett <[email protected]> wrote:
> > On Mon, Jan 02, 2012 at 12:48:48PM -0800, Linus Torvalds wrote:
> >
> >> Why are you guys making it any more complicated than that?
> >
> > Because it's inadequate. You can't guarantee that we ever loaded
> > firmware.
>
> If we didn't load the firmware before the suspend, then the resume
> function of a device sure as hell had better not load it at resume
> time either.

Get a clue. It is very common that devices keep firmware over OS warm
boots, particularly if onboard. "Lesser" beings than yourself consider it
a feature of good hardware design for fast booting and it works happily
in "lesser" OS products such Windows.

If you'd like a typical worked example the Fujitsu Q550 has a 3G modem on
it. So in the following cases we don't load firmware at boot but we do
need to after hibernate/resume

Boot windows, reboot to Linux
Boot Linux, reboot to Linux (first loads, second doesn't)

Break that and you have a regression and Linus says regressions are wrong
and we don't do them.

> And don't make the stupid argument that we don't know. That's just
> inane. Either the driver loads the firmware at startup, or it doesn't.

Wrong.

> Fix the 99%. Screw the crazy shit, don't even bother worrying about it
> until *after* the 99% is fixed.

This is the 99%. This is PC hardware, the 99% is crazy shit.

Alan

2012-01-03 00:31:20

by Marek Vasut

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

> On 02/01/12 22:31, Marek Vasut wrote:
> >> I don't think there is anyway to avoid the memory requirement if we want
> >> to be able to resume transparently to user-space (or even resume at all
> >> in some setups).
> >
> > Well ... injecting firmware into kernel with some userland helper just
> > before suspend is no-go?
>
> Its a perfectly good idea but you still need the full memory requirement
> during the suspend.

Hm ... and we can't have memory type that "can be swapped-out, but must be
loaded back before suspend" in kernel, right?

>
> Thanks,
>
> Jack

2012-01-02 21:26:16

by Jack Stone

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On 02/01/12 21:19, Matthew Garrett wrote:
> On Mon, Jan 02, 2012 at 12:48:48PM -0800, Linus Torvalds wrote:
>
>> Why are you guys making it any more complicated than that?
>
> Because it's inadequate. You can't guarantee that we ever loaded
> firmware. The same hardware might maintain state over warm reboots but
> not over suspend. We never loaded the firmware loader driver. We never
> called request_firmware(). We have no way of knowing whether the device
> is an older one that needs firmware loading, or a newer one which has it
> in flash.
>

Can we tell by the USB ids that this is a device that might need
firmware and load it into memory anyway - it would use more memory but
would always work as we would have the firmware.

Thanks,

Jack

2012-01-02 22:12:43

by Matthew Garrett

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Mon, Jan 02, 2012 at 02:03:27PM -0800, Linus Torvalds wrote:
> On Mon, Jan 2, 2012 at 1:50 PM, Matthew Garrett <[email protected]> wrote:
> > On Mon, Jan 02, 2012 at 01:27:03PM -0800, Linus Torvalds wrote:
> >
> >> If we didn't load the firmware before the suspend, then the resume
> >> function of a device sure as hell had better not load it at resume
> >> time either.
> >
> > If the hardware has lost its state then refusing to load the firmware at
> > resume time isn't going to leave you with a working device.
>
> What the heck is your problem?
>
> Go back and read it.
>
> If it wasn't loaded before, THEN IT WASN'T WORKING BEFORE EITHER! And
> if it was loaded before, then it would be cached (and thus trivially
> reloaded without having to invoke user-space or disk devices that may
> not be running) and thus loading it would work.

The kernel that loaded the firmware to make the device work is not
required to be the same kernel that we're running when we suspend. The
hardware keeps its firmware over warm reboots, and if it already has
loaded firmware then we have no way of knowing that it needs firmware at
all.

--
Matthew Garrett | [email protected]

2012-01-03 15:16:19

by Alan Stern

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Tue, 3 Jan 2012, Oliver Neukum wrote:

> > The USB suspend/resume function does that "unbind/rebind" dance, and
> > that causes a "device_attach()". Which causes a probe() to be called.
> >
> > Should it do that? I think not, not if the ID's haven't changed. But I
> > don't know the USB layer all that well.
>
> The ID has changed. That is why we run into this problem.
> And indeed the USB layer does the rebind only if either
>
> a) ID has changed

That's not true, or at least, it shouldn't be true.

If the ID has changed then there's nothing to rebind. The old device
is gone; it cannot be resumed or rebound at all. That's the theory,
and indeed things really do work that way during reset-resume.

Now, during an ordinary resume, the USB core does not check IDs. It
assumes that if suspend power was provided to the bus and the device
was connected to the bus the entire time, then the device remains
unchanged. Perhaps this assumption is a mistake, but so far I haven't
seen any data to prove that it is.

> b) the driver tells us that it cannot deal with a device that has lost its state

This case applies only in situations where the driver doesn't have a
reset_resume method. The uvc driver does have one.

(You left out a third case: The driver doesn't have any suspend/resume
methods. But this case applies only when the device has not lost its
state, and besides, the uvc driver does have these methods.)

Alan Stern


2012-01-03 00:31:06

by Alan

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

> So we have a table of USB ids that list the devices that need firmware drivers. When the uvcvideo driver registers the firmware interface with a USB id that matches the list we load the firmware driver and pass it a "warm start" event so it can cache the firmware.
>
> As long as we can identify the devices that *might* need firmware then we are ok.

You just broke the module loading guarantees that stop it all coming apart
in the first place. Probably not in many real world ways but not good.

If we warm booted from Windows into Linux and then suspended then on
resume we won't have the driver for the firmware loaded present - we
never saw the 'need firmware' id.

That moves us from a dependancy chain for firmware to one including
firmware and modules (imagine /lib/modules over iscsi for an iscsi
adapter needing firmware).

At the moment I don't think anyone has been dumb enough to do firmware
requiring storage and NFS root over firmware requiring USB devices is
pretty dumb. I don't however think it's a good hole to dig ?

Alan


2012-01-02 20:41:06

by Jack Stone

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On 02/01/12 16:54, Alan Stern wrote:
> What Oliver and I have been discussing is delaying the probe call until
> after resume is finished, when userspace is back up and everything is
> operating normally again. It would be more awkward than the current
> code, but it definitely can be done.

What about the case where the firmware that needs to be loaded is on the
USB device that needs the firmware. This can be resolved at boot using
an initrd but at resume time we don't have that.

Thanks,

Jack


2012-01-02 22:19:59

by Linus Torvalds

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Mon, Jan 2, 2012 at 2:12 PM, Matthew Garrett <[email protected]> wrote:
>
> The kernel that loaded the firmware to make the device work is not
> required to be the same kernel that we're running when we suspend.

Why do you make up all these idiotic theoretical cases that nobody
cares about and has no relevance what-so-ever for the 99%?

Seriously? It's just stupid.

If you have some multi-kernel setup, or depend on booting windows
before booting Linux in order to have the firmware there, or you use
hibernate and switch things around (hibernate, not suspend - different
thing), and cannot load the firmware in the regular kernel, who the
hell cares?

And if you *can* load the firmware in the new kernel, just do it
before the suspend event. Problem solved.

And once more: your made-up scenario has *nothing* what-so-ever to do
with the actual warnings that started this whole thread.

You seem to *intentionally* be off in some random alternate reality
that is not relevant to anybody else, or to the actual reported
problems at hand.

Why?

Linus

2012-01-03 16:29:29

by Linus Torvalds

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Tue, Jan 3, 2012 at 4:24 AM, Matthew Garrett <[email protected]> wrote:
>
> The IDs will have changed due to the firmware falling out.

The thing is, I don't think it has.

You said this used to work. And I suspect that the ID's haven't
changed at all. The unbind/rebind kind of supports that (it shouldn't
re-bind if it's not the same device, afaik). I suspect the bug report
is from a system that didn't actually have the firmware loaded even
before the suspend (so no USB ID changes, just the stupid
unbind/rebind).

Of course, it's possible that it "used to work" just because we used
to have that "we'll try to load things even if user space isn't
ready", which caused 30-second timeouts for the random cases where it
didn't work, but also silently "just worked" for the other random
cases where it did happen to work. But I don't think unbind/rebind is
correct for an ID change, so that really doesn't make sense as a
explanation either (at least for *this* particular bug report).

Linus

2012-01-03 05:54:10

by Linus Torvalds

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Mon, Jan 2, 2012 at 7:25 PM, Matthew Garrett <[email protected]> wrote:
> On Mon, Jan 02, 2012 at 09:45:58PM -0500, Alan Stern wrote:
>
>> Wait a second. ?Why does isight_firmware bind at this time? ?Binding to
>> new devices is handled by khubd, which doesn't start running again
>> until after the resume is finished (the device appears to be new
>> because its descriptors have changed). ?At that point there should be
>> no trouble reloading the firmware.
>
> Then why are we getting this warning? The firmware is only loaded in the
> probe function.

The USB suspend/resume function does that "unbind/rebind" dance, and
that causes a "device_attach()". Which causes a probe() to be called.

Should it do that? I think not, not if the ID's haven't changed. But I
don't know the USB layer all that well.

The whole USB suspend/resume code is also very confused in general.
See for example commit c043f1245654 ("USB: unbind all interfaces
before rebinding them") which talks about how the USB layer needs to
unbind all interfaces in order to rebind them later. But then you
look at the *code*, and it actually does a DO_REBIND, not an unbind!

The USB suspend/resume code is full of those crazy "let's use one
function, and pass it as an *argument* what to do". It's a disease.
The PM layer used to do the same thing (PMSG_xyz crap), and we've
largely gotten rid of it, but USB still plays around with those things
and makes it even *worse* exactly with these kinds of
"do_unbind_rebind()" routines that then look at the argument instead
of having a sane routine for unbinding and another sane routine for
re-binding.

So I can't follow the crap. I *can* follow the fact that people who
write the code (or the commit messages) are clearly totally confused
by it, even when they maintain it, as exemplified by the above
example.

Alan?

Linus

2012-01-01 17:06:10

by Marek Vasut

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

> > Am Sonntag, 1. Januar 2012, 10:54:47 schrieb Marek Vasut:
> > > > Am Samstag, 31. Dezember 2011, 19:39:18 schrieb Marek Vasut:
> > > > > maybe we should implement such thing into the firmware loader
> > > > > itself? Allow it -- for example via some node in /dev -- to force
> > > > > loading firmware into some buffer in kernel just before suspend so
> > > > > it'll certainly be readily available at resume time?
> > > >
> > > > This is difficult. We don't know at suspend time which firmware we
> > > > will need at resume time.
> > >
> > > That's not actually true ... you can ask the drivers if they need to
> > > load firmware after resume ... that can be implemented and udev can
> > > handle it. The driver should know if the hardware it's controlling is
> > > stupid or not.
> >
> > Device IDs can morph due to a power loss. After resumption another driver
> > would bind.
>
> What do you mean? I don't think I quite follow.

Ah I see ... you mean like you can swap the devices (eg. usb devices) when the
computer is asleep. Yea, that's an issue.

On the other hand, you'd expect those to keep pestering udev to load firmware
later in the wakeup process (maybe it can be somehow defered?). But if we
actually implemented some pre-suspend FW cache, it might help the speed of
resume a bit, right ?

M
>
> M
>
> > Regards
> >
> > Oliver

2012-01-03 09:40:14

by Alexander E. Patrakov

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

Matthew Garrett <[email protected]> wrote:

> On Mon, Jan 02, 2012 at 01:27:03PM -0800, Linus Torvalds wrote:
>
> > If we didn't load the firmware before the suspend, then the resume
> > function of a device sure as hell had better not load it at resume
> > time either.
>
> If the hardware has lost its state then refusing to load the firmware
> at resume time isn't going to leave you with a working device.
>
> > And for chrissake, don't bother making it more complicated than it
> > is, just for some theoretical hardware or situation that nobody
> > cares about.
>
> It's not theoretical hardware. This appears to be the current
> behaviour of the isight devices. If you reboot they retain their
> firmware. If you suspend, they don't. So if we have a flow like this:
>
> 1) user boots from cold. Device comes up with generic USB ID.
> 2) isight_firmware loads and binds. Firmware is loaded. Device
> disconnects and reconnects with an ID that's bound by the UVC driver.
> 3) user reboots. Device comes up with UVC ID. isight_firmware doesn't
> bind.
> 4) user suspends.
> 5) user resumes. isight_firmware binds and attempts to load firmware.
>
> then just caching the firmware is inadequate - we had never
> previously seen the device on this boot, so we've never loaded it in
> order to cache it. isight_firmware could unconditionally load the
> firmware on module load just in case a device is plugged in, but that
> seems even less elegant than caching it.

What a heated discussion due to, essentially, a non-technical, legal
issue! Remember that the whole "userspace firmware loader" saga
together with the asynchronous firmware interface started when Debian
started complaining over the non-freeness of the firmware being bundled
as a part of the kernel module as an array of bytes. That design,
however, never had such dependency issues. So maybe revert to it, with
the following changes, and solve the legal issue seen by Debian by
hiring a lawyer?

1. Make firmware a special case of a data-only non-GPL kernel module.
Change the tainter so that it doesn't taint the kernel for data-only
modules.
2. Make the actual driver depend on the relevant firmware modules for
all devices supported by it, even if the devices don't always need the
said firmware.
3. Disallow building drivers that need firmware as non-modules.
4. Do something (e.g. split the driver into a core and a shim, or make
a fake firmware module) to allow the user to install without firmware
if he knows that it works with his hardware.
5. Profit!

This way, as long as the driver is loaded, the necessary firmware is
also there, as a dependency.

--
Alexander E. Patrakov


2012-01-03 11:56:05

by Oliver Neukum

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

Am Dienstag, 3. Januar 2012, 01:42:20 schrieb Alan Cox:
> In that case however you don't want some generic firmware module knowing
> all this crap, your driver can just request_firmware() the stuff as
> modprobe and free it up on the module unload. For a typical 8bit firmware
> of a few K you'll free a ton more memory unloading the module than the
> firmware ! That I think actually covers the majority of devices under
> discussion.

I am afraid it doesn't, at least not fully.
We have many devices whose primary (operational) driver does not load
the firmware. That job is left to a secondary driver or user space.

We could leave those secondary drivers and their firmware in RAM after
their primary usage and except for a few pathological cases (which
can be solved with a few udev rules preventively loading drivers) we'd
be good, but we lack a mechanism for switching to a seconary driver
and back during resumption.

Regards
Oliver

2012-01-02 21:01:08

by Linus Torvalds

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Mon, Jan 2, 2012 at 12:55 PM, Jack Stone <[email protected]> wrote:
>
> The problem comes with knowing when to put the firmware - how do we tell
> the generic driver that that device has firmware that might need to be
> freed on disconnect. I don't know enough about USB to know if we can
> tell if the same device has reconnected due to firmware being loaded or
> if it was simply unplugged and a new device plugged in.

Well, if we just have a 10-second timeout for the firmware flushing
after the last "put" operation, we don't even care.

It doesn't even *matter* if the device is a physically different
device - if the driver loads the same firmware, it will be cached, and
it will still be correct for that (physically different, but with the
same firmware requirements) device. So even the whole "user switched
devices around" issue is irrelevant.

Of course, if the driver is some piece-of-crap that explicitly loads
the wrong firmware for the device it manages, the firmware will be
wrong, but hey, "Don't do that then". The drivers job very much
includes being able to look at the device ID (whether USB or PCIe or
PCMCIA) and figure out which firmware image it needs to load.

Linus

2012-01-02 22:31:32

by Marek Vasut

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

> On 02/01/12 21:52, Marek Vasut wrote:
> >> On 02/01/12 21:23, Linus Torvalds wrote:
> >>> On Mon, Jan 2, 2012 at 1:09 PM, Jack Stone <[email protected]> wrote:
> >>>> What about USB "class" drivers e.g. usb-storage. They handle any
> >>>> device that reports itself as a usb mass storage device. There could
> >>>> be a device that needs to be bootstrapped before it becomes a generic
> >>>> usb mass storage device. Do we really want to have to write a new
> >>>> driver that is almost identical to the generic driver but handles the
> >>>> USB firmware correctly.
> >>>
> >>> I'd hope that the generic driver just expose enough interfaces that
> >>> you could basically do a "firmware-load" driver that just loads the
> >>> firmware and then attaches the device to the generic driver.
> >>
> >> Sounds workable.
> >>
> >> To make the firmware caching easier I would propose one extra function
> >> in addition to the aforemensioned get_firmware / put_firmware - a
> >> find_firmware function to search the cache and return the appropriate
> >> firmware blob. It should only be called if the caller already has a
> >> refcount to the firmware, it's only use is to save every driver saving a
> >> pointer to the firmware.
> >>
> >> If noone beats me to it I will try and put together an RFC for a new
> >> version.
> >
> > The problem is on systems with limited resources, most notably RAM. If
> > you plug in many devices at once, many firmwares will be cached at one
> > point, efectivelly doing DoS attach on the machine?
> >
> > Also, how will this solve the suspect-resume issue? What if the device
> > suspends only occasionally -- like every 24 hours -- then you'd have the
> > FW cached all the time too?
>
> Yes, at least to begin with. If we can come up with a robust scheme
> which doesn't require the firmware to be kept in memory then that would
> also be workable.
>
> For example, drivers which know they don't ever need the firmware again
> wouldn't need to cache it. That would probably be quite a small list -
> there are systems that cut power to USB devices over suspend and so
> those devices would need the firmware reloading.

That's the problem -- there are devices that can suspend, but in the end, the
port turns of the power to those devices and they loose fw anyway.
>
> I don't think there is anyway to avoid the memory requirement if we want
> to be able to resume transparently to user-space (or even resume at all
> in some setups).

Well ... injecting firmware into kernel with some userland helper just before
suspend is no-go?

M

>
> Thanks,
>
> Jack

2012-01-01 09:46:40

by Oliver Neukum

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

Am Samstag, 31. Dezember 2011, 19:39:18 schrieb Marek Vasut:

> maybe we should implement such thing into the firmware loader itself? Allow it
> -- for example via some node in /dev -- to force loading firmware into some
> buffer in kernel just before suspend so it'll certainly be readily available at
> resume time?

This is difficult. We don't know at suspend time which firmware we will
need at resume time. We could keep a record of firmware we needed
to get to the state we are, but this is a very complex scheme.

Regards
Oliver

2012-01-03 13:36:34

by Alan

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

> How so? You still need to get the firmware to the device before
> you unfreeze user space to get a benefit from this. Having it in
> RAM is only half the deal.

And this can be handled by the existing drivers happily as they do now
rather than centralised in some kind of giant firmware turd that will be
horrible to maintain.

The driver visible differences will be

- you can take a reference to a firmware to ensure it will be there at
resume
- an actual grab and use of that firmware during resume will not block


Alan

2012-01-02 21:31:03

by Jack Stone

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On 02/01/12 21:23, Linus Torvalds wrote:
> On Mon, Jan 2, 2012 at 1:09 PM, Jack Stone <[email protected]> wrote:
>>
>> What about USB "class" drivers e.g. usb-storage. They handle any device
>> that reports itself as a usb mass storage device. There could be a
>> device that needs to be bootstrapped before it becomes a generic usb
>> mass storage device. Do we really want to have to write a new driver
>> that is almost identical to the generic driver but handles the USB
>> firmware correctly.
>
> I'd hope that the generic driver just expose enough interfaces that
> you could basically do a "firmware-load" driver that just loads the
> firmware and then attaches the device to the generic driver.

Sounds workable.

To make the firmware caching easier I would propose one extra function
in addition to the aforemensioned get_firmware / put_firmware - a
find_firmware function to search the cache and return the appropriate
firmware blob. It should only be called if the caller already has a
refcount to the firmware, it's only use is to save every driver saving a
pointer to the firmware.

If noone beats me to it I will try and put together an RFC for a new
version.

Thanks,

Jack


2012-01-03 13:42:38

by Alan

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

> What a heated discussion due to, essentially, a non-technical, legal
> issue! Remember that the whole "userspace firmware loader" saga
> together with the asynchronous firmware interface started when Debian
> started complaining over the non-freeness of the firmware being bundled
> as a part of the kernel module as an array of bytes. That design,

Actually thats only a part of it, and irrelevant anyway.

Compiling in the firmware is 1:1 mapping with doing request_firmware at
module load and freeing it on module unload. We can do that today, and in
fact some drivers with small firmware do. The other reason it exists is
because we have drivers that have megabytes of firmware data attached to
them, sometimes several versions for different chip versions. In those
cases neither compiling it in nor loading it on module load is a workable
solution because of the memory usage.

In some of the extreme cases like qcserial I don't think anyone has even
been able to count all the firmwares that exist !

> however, never had such dependency issues. So maybe revert to it, with
> the following changes, and solve the legal issue seen by Debian by
> hiring a lawyer?

It may have escaped your notice but a lot of Linux companies have
corporate lawyers and do talk to them about such issues and have been for
years.

Alan

2012-01-02 03:24:58

by Marek Vasut

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

> On Sun, 1 Jan 2012 20:39:45 +0000
>
> Alan Cox <[email protected]> wrote:
> > When you suspend the power gets killed so the device loses its firmware
> > and goes back to being a firmware requesting thing on resume.
> >
> > Worse still - you don't easily know if the device is in fact new and was
> > added while suspended, or was always there.
> >
> > So for those devices you do need to load the firmware into them
> > automatically after the resume to work out what they are and get the MAC
> > to see if its the same wireless card or not.
>
> Well, that does not prevent you from caching the firmware once you
> got it from userspace and keep it until module unload (or probably device
> close), so that it is already available on resume.

That's actually wrong. If you cached every single firmware, the kernel would
gulp down a lot of space that can't be swapped out!

M

2012-01-02 21:19:15

by Matthew Garrett

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Mon, Jan 02, 2012 at 12:48:48PM -0800, Linus Torvalds wrote:

> Why are you guys making it any more complicated than that?

Because it's inadequate. You can't guarantee that we ever loaded
firmware. The same hardware might maintain state over warm reboots but
not over suspend. We never loaded the firmware loader driver. We never
called request_firmware(). We have no way of knowing whether the device
is an older one that needs firmware loading, or a newer one which has it
in flash.

--
Matthew Garrett | [email protected]

2012-01-02 08:38:10

by Marek Vasut

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

> On Mon, Jan 2, 2012 at 5:17 AM, Marek Vasut <[email protected]> wrote:
> > FYI this is not only problem with USB, but with PCMCIA too for example.
> >
> > M
>
> Also, PCI Express. Some Nvidia graphics cards only assume their real
> PCI device ID during video POST (which can happen either as part of
> system POST, if the card is primary, or during Linux startup/resume,
> for a secondary card).

Now that's really screwed. Is that "nvidia/intel optimus" also affected by this
? (that thing where you have two GPUs in one laptop).

M

2012-01-01 20:39:20

by Alan

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

> > Device IDs can morph due to a power loss. After resumption another driver
> > would bind.
>
> What do you mean? I don't think I quite follow.

Lots of USB hardware appears as one device id when powered on, then when
the firmware is loaded turns into something else that becomes relevant to
the user.

So on powerup your wireless card goes

'I've no idea what I am, feed me firmware for this id"

when it gets the firmware it disconnects, reconnects and says

"I am a USB wireless card with this id"

When you suspend the power gets killed so the device loses its firmware
and goes back to being a firmware requesting thing on resume.

Worse still - you don't easily know if the device is in fact new and was
added while suspended, or was always there.

So for those devices you do need to load the firmware into them
automatically after the resume to work out what they are and get the MAC
to see if its the same wireless card or not.


Alan

2012-01-02 16:54:20

by Alan Stern

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Sun, 1 Jan 2012, Linus Torvalds wrote:

> On Sun, Jan 1, 2012 at 8:17 AM, Alan Stern <[email protected]> wrote:
> >
> > As Linus pointed out, the real problem here is the firmware loader.
> > The way it is now, a driver can't always depend on the data being
> > available, even during a normal boot. ?It ought to use an asynchronous
> > approach; then none of these problems would arise.
>
> WRONG.
>
> Alan, you're not getting it. Loading firmware as part of
> suspend/resume is WRONG.

Having read through the remainder of your message, I now see your point
and agree with it. However, you misunderstood what I was trying to
say.

(For example, consider the issue of firmware loading during bootup.
Wouldn't that work out better if an async approach was always used?
And doesn't this argue that drivers should never do synchronous
firmware loads?)

The issue with firmware, USB, and suspend/resume involves several
considerations. The basic fact is this:

USB devices sometimes lose their firmware during suspend.
Before the device can be used again, the firmware has to be
reloaded.

The question is how and when to reload the firmware. As a general
rule, current drivers do this synchronously as part of their probe
routine (though there are exceptions). For all I know, some drivers
may try to do this even if the firmware is already loaded. The problem
we face is that the USB core may call the probe routine during the
resume procedure. And as we know, during resume is not a good time to
load firmware.

What Oliver and I have been discussing is delaying the probe call until
after resume is finished, when userspace is back up and everything is
operating normally again. It would be more awkward than the current
code, but it definitely can be done.

> It would not be made any better by an asynchronous approach, and that
> isn't the problem with request_firmware().

Leaving aside questions about request_firmware()'s merits, consider
what would happen if drivers loaded firmware asynchronously during
probe. The USB core could call the probe routine while the system was
resuming, the probe routine would return immediately, and the actual
firmware load would not take place until later when userspace was
operating. Then everything would look the same as if probe had been
called after the system resume was finished.

If, as mentioned parenthetically above, drivers should always load
firmware asynchronously, then the USB core wouldn't need to be changed
at all.

> Suspend/resume is *special*, and it's special for a very simple
> reason: unlike bootup or attaching a new device, suspend/resume
> happens WHILE THE USER IS ACTIVE.
>
> Loading firmware at that time is wrong. It's impossible. You have to
> have the firmware available *before* any processes might need it, but
> at the same time actually loading the firmware may need help from user
> space and/or other devices. It's a chicken-and-egg problem.

Not in the case we have been discussing. There is no chicken-and-egg
dependancy; no processes can need to use the device before the firmware
is loaded because the old device instance has already been unregistered
(and the new one won't be registered until after the firmware is
loaded).

> So let me repeat one more time: Loading firmware at resume time is a
> device driver bug. Seriously. How many times do I have to say it?

It sounds like you were thinking of a situation in which the driver
would try to reload the firmware while keeping the device instance
intact and available the entire time. That isn't what Oliver and I
have been talking about.

On the other hand, something very much like it may crop up in
connection with reset-resume. Usually it doesn't, because a USB device
without firmware generally presents a different set of descriptors from
the ones it presents when the firmware is running. As a result, when
the firmware is lost the device will change identities, and the USB
core will not attempt a reset-resume.

But it is possible that some devices might present the same descriptors
with and without firmware. In such cases it would then be up to the
driver to detect that the firmware was gone and arrange for it to be
reloaded somehow -- before userspace wakes up. In this case your point
is perfectly valid; the driver simply cannot rely on the firmware
loader for this, whether synchronously or not. The only option is for
the driver to cache the firmware. (Or give up entirely and unbind
itself from the device -- but then how would it eventually rebind?)

A similar problem can occur with USB drivers that don't support
suspend/resume at all. Such drivers get unbound during suspend and
then rebound during resume. When the probe routine runs for rebinding,
it may try to reload the firmware -- even though the firmware is most
likely still present. It's fair to say this really is a driver bug.

Alan Stern


2012-01-01 16:32:27

by Marek Vasut

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

> Am Sonntag, 1. Januar 2012, 10:54:47 schrieb Marek Vasut:
> > > Am Samstag, 31. Dezember 2011, 19:39:18 schrieb Marek Vasut:
> > > > maybe we should implement such thing into the firmware loader itself?
> > > > Allow it -- for example via some node in /dev -- to force loading
> > > > firmware into some buffer in kernel just before suspend so it'll
> > > > certainly be readily available at resume time?
> > >
> > > This is difficult. We don't know at suspend time which firmware we will
> > > need at resume time.
> >
> > That's not actually true ... you can ask the drivers if they need to load
> > firmware after resume ... that can be implemented and udev can handle it.
> > The driver should know if the hardware it's controlling is stupid or
> > not.
>
> Device IDs can morph due to a power loss. After resumption another driver
> would bind.

What do you mean? I don't think I quite follow.

M

>
> Regards
> Oliver

2012-01-02 03:29:26

by Marek Vasut

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

> > > Device IDs can morph due to a power loss. After resumption another
> > > driver would bind.
> >
> > What do you mean? I don't think I quite follow.
>
> Lots of USB hardware appears as one device id when powered on, then when
> the firmware is loaded turns into something else that becomes relevant to
> the user.
>
> So on powerup your wireless card goes
>
> 'I've no idea what I am, feed me firmware for this id"
>
> when it gets the firmware it disconnects, reconnects and says
>
> "I am a USB wireless card with this id"
>
> When you suspend the power gets killed so the device loses its firmware
> and goes back to being a firmware requesting thing on resume.

Ah, even worse then. I wasn't aware of devices like this at all. Mhmmmm ...

>
> Worse still - you don't easily know if the device is in fact new and was
> added while suspended, or was always there.

Yea, that's the problem I outlined in the subsequent email.
>
> So for those devices you do need to load the firmware into them
> automatically after the resume to work out what they are and get the MAC
> to see if its the same wireless card or not.

Sure, but then you'd have to wait for userland to actually be ready to feed
those. Sometimes, you'd like to avoid that -- for example for devices that are
always present. But this solution is stupid as it doesn't solve the whole
problem.

>
> Alan

2012-01-03 00:41:52

by Jack Stone

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.



Alan Cox <[email protected]> wrote:

>> So we have a table of USB ids that list the devices that need
>firmware drivers. When the uvcvideo driver registers the firmware
>interface with a USB id that matches the list we load the firmware
>driver and pass it a "warm start" event so it can cache the firmware.
>>
>> As long as we can identify the devices that *might* need firmware
>then we are ok.
>
>You just broke the module loading guarantees that stop it all coming
>apart
>in the first place. Probably not in many real world ways but not good.
>
>If we warm booted from Windows into Linux and then suspended then on
>resume we won't have the driver for the firmware loaded present - we
>never saw the 'need firmware' id.

I agree we won't see the need firmware id but we should still be able to identify the device from some of its ids shouldn't we? We could then cache the firmware even though we never loaded it.

>That moves us from a dependancy chain for firmware to one including
>firmware and modules (imagine /lib/modules over iscsi for an iscsi
>adapter needing firmware).
>
>At the moment I don't think anyone has been dumb enough to do firmware
>requiring storage and NFS root over firmware requiring USB devices is
>pretty dumb. I don't however think it's a good hole to dig ?

I'm sure someone will try it at somepoint...

Thanks,

Jack

--
Sent from my Android phone with K-9 Mail. Please excuse my brevity.

2012-01-03 00:58:14

by Alan

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

> >Hm ... and we can't have memory type that "can be swapped-out, but must
> >be
> >loaded back before suspend" in kernel, right?
>
> Nope. Kernel swapping is a big headache and Linux doesn't do it. The only thing we can do is drop any clean cache pages, but that has performance implications.

You could however have an arrangement where the firmware refmanager does
a list of request_firmware()s at the very start of the suspend process. It
cannot be done during suspend for obvious reasons (sorry I already
suspended the hard disk) but it can be done just before.

In that case with get/put_firmware you'd just need to extend the API to

get_firmware()
put_firmware()
get_for_suspend_firmware()

and a driver might do something like


module_init

fw = get_for_suspend_firmware("wombatdepolariser.fw");


open: /* just optimising */
kref_get(&fw->kref);

close:
put_firmware(fw);


module_unload:
put_firmware(fw);


Now the uglier side of that is how you make sure that if
get_for_suspend_firmware() works that when we suspend the firmware is
present. We could keep the file handle open but that might block unmounts
or we could assume that as its superuser stuff that the distro authors
don't screw up - in which case we'd refuse to suspend with a nasty
complaint if the firmware turned out to have been deleted behind our back.

You'd also have to be in situation where new device probes were not
processed but userspace (eg the fw loader helper) were still running.

That seems to me to be a useful optimisation project for bigger firmwares
but not something to implement on day 1 ?

Alan

2012-01-03 00:18:23

by Alan

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

> You mensioned earlier about not being able to tell the difference
> between a device that needs firmware and one that needs flash (e.g. they
> use exactly the same ids). It doesn't really matter - we just assume
> that it might need firmware and load it anyway. It uses more memory but
> is robust.

We only need to do that for the devices where order and not blocking
matters. There are a few (and some are video) where the firmware sizes is
megabytes, which on an embedded controlling device is not acceptable. I
don't believe any of them are things where simply delaying the
restoration will cause problems however - its video, and DVB and the like
not wireless or serial.

Thankfully most of the firmwares for controllers like wireless are in the
Kb range although this is going to change dramatically as people start
putting things with more brains than an 8051 or AVR into their devices.

And there an awful lot of devices with tiny 8K or so firmware where if we
have get/put firmware nobody is going to *care* if we just cache it on
module load. 40K driver, 16K of buffers, 2K firmware....

> Also, as Linus mensioned, for some devices we just don't care. For
> example, if we drop a video call over suspend it's not ideal but it
> doesn't prevent the system from resuming.

Doesn't work generically though. Some ARM platforms (and Moorestown in the
x86 world) can suspend/resume as a power state. So while it's probably
true for USB it's not generically a property of suspend/resume that you
can assume in any kind of firmware and ordering model.

The world is heading this way more and more. It's moving from the old PC
model of 'user closes lid, clunk for 15 seconds, enter suspend, user
opens lid, churn churn, video, churn clunk. resume' to suspend/resume
being so fast it happens between keystrokes.

(There is a whole other can of pending worms there where 'suspend' isn't
a single state as Linux and ACPI like to pretend. There are devices that
can show the display while the machine is "suspended", devices that can
play back mp3 files while the machine is "suspended" and so on. Imagine
a world where you are listening to music on your phone without
interruption, typing on the display which is showing all the time, and
with the CPU and at the Linux level your device is suspended between
keystrokes)

> Lets concentrate on fixing all the cases that could prevent resume (the "99%")
> and fix the other devices later.

The rules on regressions as stated by Linus are clear - no regressions. So
we need to make it work again, then fix it nicely.

Quite honestly the ugly cases with massive firmware files and the like
really want deferring, and some do that anyway with a userspace loader
for obvious reasons. Request_firmware really isn't very good for
'request bits of firmware one by one' either. Thankfully there are not
many of them.

Alan

2012-01-03 00:20:16

by Alan

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

> No, the uvcvideo driver would need to call out to load the firmware
> loader driver, otherwise you'll never know to load the firmware loader
> driver.

Does that not simply depend which driver has the relevant USB
identifiers ?


2012-01-03 08:28:53

by Ingo Molnar

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.


* Alan Cox <[email protected]> wrote:

> > You mensioned earlier about not being able to tell the
> > difference between a device that needs firmware and one that
> > needs flash (e.g. they use exactly the same ids). It doesn't
> > really matter - we just assume that it might need firmware
> > and load it anyway. It uses more memory but is robust.
>
> We only need to do that for the devices where order and not
> blocking matters. There are a few (and some are video) where
> the firmware sizes is megabytes, which on an embedded
> controlling device is not acceptable. I don't believe any of
> them are things where simply delaying the restoration will
> cause problems however - its video, and DVB and the like not
> wireless or serial.

Here's the size histogram/analysis of all *.fw, *.bin, *.dat,
*.ucode, etc. files in /lib/firmware on a fully populated
distro:

476 firmware blobs total

The toplist:

-rw-r--r--. 1 root root 2786404 Jul 24 2010 ./bcm70012fw.bin
-rw-r--r--. 1 root root 1781048 Feb 9 2011 ./phanfw.bin
-rw-r--r--. 1 root root 864276 Jul 24 2010 ./bcm70015fw.bin
-rw-r--r--. 1 root root 844980 Feb 8 2011 ./asihpi/dsp6200.bin
-rw-r--r--. 1 root root 636980 Feb 8 2011 ./asihpi/dsp6600.bin
-rw-r--r--. 1 root root 627696 Feb 8 2011 ./asihpi/dsp6400.bin
-rw-r--r--. 1 root root 563592 Aug 4 22:04 ./myri10ge_rss_ethp_z8e.dat
-rw-r--r--. 1 root root 553192 Aug 4 22:04 ./myri10ge_rss_eth_z8e.dat
-rw-r--r--. 1 root root 504916 Feb 8 2011 ./asihpi/dsp8900.bin
-rw-r--r--. 1 root root 498128 Sep 7 20:14 ./ct2fw.bin

- 2% of them, i.e. just a tiny fraction is over 512 KB.
- 80% of the firmware blobs are below 100K.
- 50% of them are below 16K.

So loading them into RAM is the obviously right solution.

Those few devices that absolutely want to load the firmware blob
dynamically on some weird low-RAM system can do so *BEFORE*
suspending.

There is nothing that prevents a low-RAM system from loading the
firmware blob in an early suspend callback and making sure it's
there at resume time - and then unloading it from RAM after
resume.

I.e. large blobs can manage their RAM usage just fine - but the
obscenity of the 1% should not control the design and sanity of
the 99% case ...

> [...]
>
> The world is heading this way more and more. It's moving from
> the old PC model of 'user closes lid, clunk for 15 seconds,
> enter suspend, user opens lid, churn churn, video, churn
> clunk. resume' to suspend/resume being so fast it happens
> between keystrokes.

Exactly!

Thanks,

Ingo

2012-01-03 13:20:41

by Alan

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

> That said, I guess there are more complicated / difficult pieces of
> hardware that might not do that. Is that the case here? Can we detect
> that the firmware needs reloading and have the primary driver "yield" to
> the firmware driver?

That doesn't really make any sense in the way USB works.

However if you've got a system to load and pin firmware *on suspend* and
flush it post resume then you don't actually need special 'firmware'
drivers just a slightly fancier lib/firmware that refcounts and handles
load on suspend/delayed flush.

(and probably a sysctl for cache tuning so that typical small firmwares
get cached on bigger boxes anyway for speed)

2012-01-03 15:09:22

by Alan Stern

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Tue, 3 Jan 2012, Matthew Garrett wrote:

> On Mon, Jan 02, 2012 at 09:45:58PM -0500, Alan Stern wrote:
>
> > Wait a second. Why does isight_firmware bind at this time? Binding to
> > new devices is handled by khubd, which doesn't start running again
> > until after the resume is finished (the device appears to be new
> > because its descriptors have changed). At that point there should be
> > no trouble reloading the firmware.
>
> Then why are we getting this warning? The firmware is only loaded in the
> probe function.

I don't know. Do you have a kernel log (with CONFIG_USB_DEBUG enabled)
posted somewhere?

Alan Stern


2012-01-02 23:00:45

by Matthew Garrett

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Mon, Jan 02, 2012 at 02:46:56PM -0800, Linus Torvalds wrote:

> It's *trivial* to attach the firmware driver and load the firmware
> even if the firmware isn't needed - because you know it *will* be
> needed if somebody suspends. Why not just do that? Why make up these
> horrible problems that are totally irrelevant?

It means adding complexity to drivers that don't currently care about
it, and carrying that cost even for hardware that doesn't need it. It
can certainly be made to work, but it's inelegant. We could avoid this
specific instance of the problem by just punting responsibility to
userland instead.

It's clear that we can solve this. All I'm saying is that just making
the firmware loader cache things isn't a solution in itself. In this
specific case, it means merging the isight_firmware driver into
uvcvideo, which is something the uvcvideo maintainer didn't seem keen on
a few years ago.

--
Matthew Garrett | [email protected]

2012-01-02 21:50:38

by Matthew Garrett

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Mon, Jan 02, 2012 at 01:27:03PM -0800, Linus Torvalds wrote:

> If we didn't load the firmware before the suspend, then the resume
> function of a device sure as hell had better not load it at resume
> time either.

If the hardware has lost its state then refusing to load the firmware at
resume time isn't going to leave you with a working device.

> And for chrissake, don't bother making it more complicated than it is,
> just for some theoretical hardware or situation that nobody cares
> about.

It's not theoretical hardware. This appears to be the current behaviour
of the isight devices. If you reboot they retain their firmware. If you
suspend, they don't. So if we have a flow like this:

1) user boots from cold. Device comes up with generic USB ID.
2) isight_firmware loads and binds. Firmware is loaded. Device
disconnects and reconnects with an ID that's bound by the UVC driver.
3) user reboots. Device comes up with UVC ID. isight_firmware doesn't
bind.
4) user suspends.
5) user resumes. isight_firmware binds and attempts to load firmware.

then just caching the firmware is inadequate - we had never previously
seen the device on this boot, so we've never loaded it in order to cache
it. isight_firmware could unconditionally load the firmware on module
load just in case a device is plugged in, but that seems even less
elegant than caching it.

Now this is obviously somewhat mitigated because isight_firmware won't
have been autoloaded at (3), so won't be there at (5) unless the user's
manually loaded it. But insmodding a driver shouldn't result in stuff
breaking later.

--
Matthew Garrett | [email protected]

2012-01-02 22:29:28

by Matthew Garrett

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Mon, Jan 02, 2012 at 02:19:37PM -0800, Linus Torvalds wrote:
> On Mon, Jan 2, 2012 at 2:12 PM, Matthew Garrett <[email protected]> wrote:
> >
> > The kernel that loaded the firmware to make the device work is not
> > required to be the same kernel that we're running when we suspend.
>
> Why do you make up all these idiotic theoretical cases that nobody
> cares about and has no relevance what-so-ever for the 99%?

Rebooting is a theoretical case?

> And if you *can* load the firmware in the new kernel, just do it
> before the suspend event. Problem solved.

How do we know to do that? The device appears to be a UVC device at
suspend. It's only at resume that we discover that it's actually a
programmable device.

> And once more: your made-up scenario has *nothing* what-so-ever to do
> with the actual warnings that started this whole thread.

The warnings that started the thread are associated with a driver that I
wrote. I'm explaining to you that your suggestion doesn't work for
a real-world scenario involving that driver. If the firmware has been
loaded and then the user reboots, the new kernel has no indication that
it needs to load the firmware. There are various ways of handling this,
but it's simply untrue that we'll always load the firmware at kernel
init time.

--
Matthew Garrett | [email protected]

2012-01-02 20:55:10

by Jack Stone

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On 02/01/12 20:48, Linus Torvalds wrote:
> On Mon, Jan 2, 2012 at 12:41 PM, Jack Stone <[email protected]> wrote:
>>
>> What about the case where the firmware that needs to be loaded is on the
>> USB device that needs the firmware. This can be resolved at boot using
>> an initrd but at resume time we don't have that.
>
> Quite frankly, caching the firmware will just automagically fix this.
>
> I don't understand why people even bother to talk about USB ID's etc
> changing - that is totally irrelevant. We don't look up firmware by
> USB ID's anyway.

I agree. The only problem I can see is with the lifetime of the
firmware. If we have a generic USB driver that looks after (for example)
all USB disks then we do not want to add the knowedge to it for each
individual vendors firmware.

For example:

1/ Plug in device, registers as a bootstrap device
2/ Load firmware into device (and put firmware in cache)
3/ Device disconnects and reconnects with new USB id
4/ Generic driver takes over as it recognizes the new id

The problem comes with knowing when to put the firmware - how do we tell
the generic driver that that device has firmware that might need to be
freed on disconnect. I don't know enough about USB to know if we can
tell if the same device has reconnected due to firmware being loaded or
if it was simply unplugged and a new device plugged in.

Hope this makes sense,

Jack


2012-01-02 21:23:44

by Linus Torvalds

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Mon, Jan 2, 2012 at 1:09 PM, Jack Stone <[email protected]> wrote:
>
> What about USB "class" drivers e.g. usb-storage. They handle any device
> that reports itself as a usb mass storage device. There could be a
> device that needs to be bootstrapped before it becomes a generic usb
> mass storage device. Do we really want to have to write a new driver
> that is almost identical to the generic driver but handles the USB
> firmware correctly.

I'd hope that the generic driver just expose enough interfaces that
you could basically do a "firmware-load" driver that just loads the
firmware and then attaches the device to the generic driver.

So instead of duplicating the driver, or teaching the generic driver
about magic setup stuff, you could do a "specialized" driver for that
particular device (more likely, that manufacturers devices) and then
have some way to hand it over.

Of course, in the end it does matter what the "class" is. For storage
devices, we really do want the kernel to handle things for us
entirely. For other kinds of devices (and the isight camera does come
to mind) I suspect it's entirely fine to just say "whatever, we don't
care enough, just have some udev rules, load things from user space,
and no, you cannot have a videochat active over a suspend/resume
event".

So while I think kernel drivers should generally strive for "you can
suspend/resume without user space even noticing", I do think that it
depends on the particular device just *how* strict you should be about
that rule. Some devices just don't need that kind of attention.

IOW, I think the kernel should *strive* for doing the right thing, and
I think the firmware loader right now makes it unnecessarily hard to
do things right, but at the same time there are no black-and-white
absolute rules. There are always going to be exceptions where somebody
says "this is a crap device, and it's simply not worth worrying
about".

Linus

2012-01-03 07:39:50

by Oliver Neukum

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

Am Dienstag, 3. Januar 2012, 01:44:07 schrieb Jack Stone:
> >> Its a perfectly good idea but you still need the full memory
> >requirement
> >> during the suspend.
> >
> >Hm ... and we can't have memory type that "can be swapped-out, but must
> >be
> >loaded back before suspend" in kernel, right?
>
> Nope. Kernel swapping is a big headache and Linux doesn't do it. The only thing we can do is drop any clean cache pages, but that has performance implications.

There is however a notifier chain that is called early enough
to still load stuff from disk so that the additional burden on memory
would be used only when it might be needed.

Regards
Oliver

2012-01-01 20:50:22

by Michael Büsch

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Sun, 1 Jan 2012 20:39:45 +0000
Alan Cox <[email protected]> wrote:

> When you suspend the power gets killed so the device loses its firmware
> and goes back to being a firmware requesting thing on resume.
>
> Worse still - you don't easily know if the device is in fact new and was
> added while suspended, or was always there.
>
> So for those devices you do need to load the firmware into them
> automatically after the resume to work out what they are and get the MAC
> to see if its the same wireless card or not.

Well, that does not prevent you from caching the firmware once you
got it from userspace and keep it until module unload (or probably device
close), so that it is already available on resume.

--
Greetings, Michael.

2012-01-03 13:28:41

by Oliver Neukum

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

Am Dienstag, 3. Januar 2012, 14:20:53 schrieb Alan Cox:
> > That said, I guess there are more complicated / difficult pieces of
> > hardware that might not do that. Is that the case here? Can we detect
> > that the firmware needs reloading and have the primary driver "yield" to
> > the firmware driver?
>
> That doesn't really make any sense in the way USB works.
>
> However if you've got a system to load and pin firmware on suspend and
> flush it post resume then you don't actually need special 'firmware'
> drivers just a slightly fancier lib/firmware that refcounts and handles
> load on suspend/delayed flush.

How so? You still need to get the firmware to the device before
you unfreeze user space to get a benefit from this. Having it in
RAM is only half the deal.

Regards
Oliver

2012-01-01 21:26:13

by Oliver Neukum

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

Am Sonntag, 1. Januar 2012, 21:30:04 schrieb Linus Torvalds:

> Suspend/resume is *special*, and it's special for a very simple
> reason: unlike bootup or attaching a new device, suspend/resume
> happens WHILE THE USER IS ACTIVE.
>
> Loading firmware at that time is wrong. It's impossible. You have to
> have the firmware available *before* any processes might need it, but
> at the same time actually loading the firmware may need help from user
> space and/or other devices. It's a chicken-and-egg problem.
>
> So let me repeat one more time: Loading firmware at resume time is a
> device driver bug. Seriously. How many times do I have to say it?

We accept that. And if everything goes well, we keep that rule.
The problem we are seeing here is a problem in USB's error
handling. It happens as we already have decided that we are
unable to properly resume the device.

Ideally usbcore would deal with such devices, but at present the USB layer
is unable to cleanly resume devices that change their IDs during
resume.
Solving this would mean putting all firmware loaders into kernel space.
And the mode switching logic as well.

But IMHO resuming a device can always fail, as it involves actual IO,
so we need a functional error handling at that point.

Regards
Oliver

2012-01-01 20:30:27

by Linus Torvalds

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Sun, Jan 1, 2012 at 8:17 AM, Alan Stern <[email protected]> wrote:
>
> As Linus pointed out, the real problem here is the firmware loader.
> The way it is now, a driver can't always depend on the data being
> available, even during a normal boot. ?It ought to use an asynchronous
> approach; then none of these problems would arise.

WRONG.

Alan, you're not getting it. Loading firmware as part of
suspend/resume is WRONG.

It would not be made any better by an asynchronous approach, and that
isn't the problem with request_firmware().

Suspend/resume is *special*, and it's special for a very simple
reason: unlike bootup or attaching a new device, suspend/resume
happens WHILE THE USER IS ACTIVE.

Loading firmware at that time is wrong. It's impossible. You have to
have the firmware available *before* any processes might need it, but
at the same time actually loading the firmware may need help from user
space and/or other devices. It's a chicken-and-egg problem.

So let me repeat one more time: Loading firmware at resume time is a
device driver bug. Seriously. How many times do I have to say it?

And making it asynchronous doesn't make it *any* less of a bug. It
doesn't change anything but timing, but the problem was never about
timing in the first place! The problem was about dependencies. User
space may well depend on the device, and *other* devices may well
depend on the device working.

So stop with the inanity. I've already told people what the fix is:
make sure that the firmware image is in memory before the suspend ever
happens, and just reload it from memory. NOT with
"request_firmware()". Because requesting the firmware at resume time
is buggy and wrong, and has nothing to do with "asynchronous" or
"synchronous". It has everything to do with "it's buggy".

Really really really.

So the problem with request_firmware() has absolutely nothing to do
with "asynchronous". The problem is that the firmware interfaces do
not cache the data in memory, and currently *cannot* sanely cache the
firmware data simply because the interface doesn't have any kind of
lifetime rules.

In other words: we could make "request_firmware()" cache *all*
firmware images forever, but there is currently no way to say "ok, the
device was unplugged - and not fakily so by a resume event, but for
real, and physically - so now you can drop the cache". Which means
that effectively request_firmware() can do no caching at all, because
it might eventually just run out of memory.

It is *possible* that we might tie the firmware lifetime rules to the
driver module lifetime. But it would probably be much better if we
made for an explicit model of "ok, the device is now really gone" so
that it would work properly with compiled-in drivers etc too.

And yes, the firmware would have to stay around even around a
resume/suspend that causes unplug events over USB. The "use USB and
lose power" actually happens also for built-in devices that may well
be disks and network cards that are *needed* by user space, so even if
the device has been electrically unplugged, it is still attached and
needs to be brought back *before* user space comes back.

That's the whole point of suspend/resume: we're not starting from a
clean slate. We are supposed to continue as if nothing happened!

Linus

2012-01-02 21:52:25

by Marek Vasut

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

> On 02/01/12 21:23, Linus Torvalds wrote:
> > On Mon, Jan 2, 2012 at 1:09 PM, Jack Stone <[email protected]> wrote:
> >> What about USB "class" drivers e.g. usb-storage. They handle any device
> >> that reports itself as a usb mass storage device. There could be a
> >> device that needs to be bootstrapped before it becomes a generic usb
> >> mass storage device. Do we really want to have to write a new driver
> >> that is almost identical to the generic driver but handles the USB
> >> firmware correctly.
> >
> > I'd hope that the generic driver just expose enough interfaces that
> > you could basically do a "firmware-load" driver that just loads the
> > firmware and then attaches the device to the generic driver.
>
> Sounds workable.
>
> To make the firmware caching easier I would propose one extra function
> in addition to the aforemensioned get_firmware / put_firmware - a
> find_firmware function to search the cache and return the appropriate
> firmware blob. It should only be called if the caller already has a
> refcount to the firmware, it's only use is to save every driver saving a
> pointer to the firmware.
>
> If noone beats me to it I will try and put together an RFC for a new
> version.

The problem is on systems with limited resources, most notably RAM. If you plug
in many devices at once, many firmwares will be cached at one point, efectivelly
doing DoS attach on the machine?

Also, how will this solve the suspect-resume issue? What if the device suspends
only occasionally -- like every 24 hours -- then you'd have the FW cached all
the time too?

M
>
> Thanks,
>
> Jack
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-usb" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2012-01-03 00:42:03

by Alan

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

> Can we tell by the USB ids that this is a device that might need
> firmware and load it into memory anyway - it would use more memory but
> would always work as we would have the firmware.

Only by wasting a lot of memory in some cases. For the majority of cases
however yes you can take a good guess that id X might need firmware Y and
the firmware will only be a few K.

In that case however you don't want some generic firmware module knowing
all this crap, your driver can just request_firmware() the stuff as
modprobe and free it up on the module unload. For a typical 8bit firmware
of a few K you'll free a ton more memory unloading the module than the
firmware ! That I think actually covers the majority of devices under
discussion.

For the giant-firmware cases, it's unworkable on small devices - but they
are special cases anyway and almost entirely video - so not currently
ones that cause a problem unless you are doing RFC4824 in which case your
nearest asylum should be consulted.

Alan

2012-01-02 21:27:26

by Linus Torvalds

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Mon, Jan 2, 2012 at 1:19 PM, Matthew Garrett <[email protected]> wrote:
> On Mon, Jan 02, 2012 at 12:48:48PM -0800, Linus Torvalds wrote:
>
>> Why are you guys making it any more complicated than that?
>
> Because it's inadequate. You can't guarantee that we ever loaded
> firmware.

If we didn't load the firmware before the suspend, then the resume
function of a device sure as hell had better not load it at resume
time either.

And don't make the stupid argument that we don't know. That's just
inane. Either the driver loads the firmware at startup, or it doesn't.
If it loads it at startup, the firmware will have been loaded. If it
loads it only at "open" time or similar (and the device wasn't
opened), then the resume had better *know* about that, and not try to
load the firmware of a device that wasn't open!

And for chrissake, don't bother making it more complicated than it is,
just for some theoretical hardware or situation that nobody cares
about.

Go back to the original report of the *actual* problems we have. They
are not some complex case where things magically change. The REAL
ACTUAL problems that people have are for very straightforward
situations where we simply do the wrong thing, largely because our
firmware loading is full of crap.

Fix the 99%. Screw the crazy shit, don't even bother worrying about it
until *after* the 99% is fixed.

Linus

2012-01-03 00:46:38

by Alan

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Tue, 3 Jan 2012 00:38:31 +0000
Matthew Garrett <[email protected]> wrote:

> On Tue, Jan 03, 2012 at 12:22:58AM +0000, Jack Stone wrote:
>
> > So we have a table of USB ids that list the devices that need firmware drivers. When the uvcvideo driver registers the firmware interface with a USB id that matches the list we load the firmware driver and pass it a "warm start" event so it can cache the firmware.
> >
> > As long as we can identify the devices that *might* need firmware then we are ok.
>
> We could do all of this. Or we could just push the responsibility for
> this specific case back to userspace, and maybe that's what I should
> have done in the first place.

My guess is this

For small firmware and most devices we ought to just grab the firmware on
module load, drop it on module unload. Whoopee its 8K or so.

For the really big horrors where we can be fairly sure its not part of a
critical path like big DVB firmware or 3G modems like qcserial then user
space.

and pray there are none that are large and problematic (eg iscsi, wireless
or some other similar sort of suspect)


Going back to loading the small firmwares at module load is exactly the
same as when they were compiled in, which didn't seem to cause much pain.

Alan

2012-01-01 03:49:14

by Larry Finger

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On 12/30/2011 06:22 PM, Linus Torvalds wrote:
> On Fri, Dec 30, 2011 at 3:54 PM, Dave Jones<[email protected]> wrote:
>> We're getting a bunch of reports against Fedora 16
>> (still using 3.1) which look like some drivers are trying to
>> load firmware on resume from suspend, while usermodehelper
>> is disabled.
>
> Ok, buggy drivers. You *must*not* load firmware in your resume path,
> since there is no actual guarantee that any particular device will be
> resumed after the disk that contains the firmware images.
>
> So it's very simple: drivers that load firmware at resume time are
> buggy. No ifs, buts, or maybes about it.
>
>> Here are some example traces:
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=746411
>
> It's isight_firmware_load(), in the isight_firmware driver. The driver
> doesn't actually do anything but load the firmware, and is apparently
> not very good at that either.
>
> It should either fake a disconnect and reconnect of the device (and
> let the reconnect then load the firmware through udev or something) or
> it should just save the firmware image in memory from the original
> load, and make the resume just re-initialize it - not load it.
>
> It's also possible that it could be considered a USB layer bug, and
> the USB layer should just not rebind the devices directly in the
> resume function, but do it somehow later. HOWEVER, that would only
> work for "random" USB devices that aren't in use by user space (like
> disks etc might be). So I think that in general the real solution is
> always just "make sure that the firmware is in memory before the
> suspend even happens".
>
> Greg - has the USB resume logic been changed lately?
>
> Matthew? Any comments about that particular driver?
>
>> https://bugzilla.redhat.com/show_bug.cgi?id=771002
>
> Same issue, different driver. Again, it's USB, and it's possible that
> USB just makes it really hard to do this correctly (ie the "save
> firmware image across suspend so that you don't have to load it at
> resume time").
>
> It's also possible that we should blame the firmware code, which is
> expressly written to encourage these kinds of bugs. It may be that i
> tshould be the firmware code that has a "get_firmware()" +
> "put_firmware()" model, and it should cache the firmware explicitly if
> the config supports suspend, so that a firmware read at resume time
> would actually work. The whole "request_firmware()" interface really
> is very prone to these kinds of bugs.
>
> But it's possible this could be fixed at the driver level by doing the
> caching there.
>
> In this case it's the rtl8192cu driver, so Larry, Chaoming, John etc
> added to the cc for that one.

A patch that delays the firmware reading until the upload is needed when the
interface is brought up will be sent to John very soon. My box loses its video
when a suspend/resume cycle is done, but the WARNING is no longer printed by
rtl8192cu.

Larry



2012-01-03 11:49:26

by Oliver Neukum

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

Am Dienstag, 3. Januar 2012, 06:53:46 schrieb Linus Torvalds:
> On Mon, Jan 2, 2012 at 7:25 PM, Matthew Garrett <[email protected]> wrote:
> > On Mon, Jan 02, 2012 at 09:45:58PM -0500, Alan Stern wrote:
> >
> >> Wait a second. Why does isight_firmware bind at this time? Binding to
> >> new devices is handled by khubd, which doesn't start running again
> >> until after the resume is finished (the device appears to be new
> >> because its descriptors have changed). At that point there should be
> >> no trouble reloading the firmware.
> >
> > Then why are we getting this warning? The firmware is only loaded in the
> > probe function.
>
> The USB suspend/resume function does that "unbind/rebind" dance, and
> that causes a "device_attach()". Which causes a probe() to be called.
>
> Should it do that? I think not, not if the ID's haven't changed. But I
> don't know the USB layer all that well.

The ID has changed. That is why we run into this problem.
And indeed the USB layer does the rebind only if either

a) ID has changed
b) the driver tells us that it cannot deal with a device that has lost its state

> The whole USB suspend/resume code is also very confused in general.
> See for example commit c043f1245654 ("USB: unbind all interfaces
> before rebinding them") which talks about how the USB layer needs to
> unbind all interfaces in order to rebind them later. But then you
> look at the *code*, and it actually does a DO_REBIND, not an unbind!

Understood. Working on that.

Regards
Oliver

2012-01-02 22:03:49

by Linus Torvalds

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Mon, Jan 2, 2012 at 1:50 PM, Matthew Garrett <[email protected]> wrote:
> On Mon, Jan 02, 2012 at 01:27:03PM -0800, Linus Torvalds wrote:
>
>> If we didn't load the firmware before the suspend, then the resume
>> function of a device sure as hell had better not load it at resume
>> time either.
>
> If the hardware has lost its state then refusing to load the firmware at
> resume time isn't going to leave you with a working device.

What the heck is your problem?

Go back and read it.

If it wasn't loaded before, THEN IT WASN'T WORKING BEFORE EITHER! And
if it was loaded before, then it would be cached (and thus trivially
reloaded without having to invoke user-space or disk devices that may
not be running) and thus loading it would work.

So regardless of the state before, it would resume "correctly" - in
the same state it was suspended. In other words: a caching firmware
loader would have done EXACTLY THE RIGHT THING.

Why the hell do you keep on harping on idiotic issues? Stop being a
moron, just repeat after me:

A caching firmware loader fixes all these issues

and is simple to boot. Stop the idiotic blathering already.

Linus

2012-01-01 02:21:46

by Alan Stern

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Sat, 31 Dec 2011, Oliver Neukum wrote:

> Am Samstag, 31. Dezember 2011, 16:33:12 schrieb Alan Stern:
>
> > Nothing has changed in the USB layer -- it has always been true that
> > devices could be reset during a suspend/resume cycle. This isn't a
> > matter of how the stack is written or anything like that; some
> > motherboards simply do not provide suspend power to their USB
> > controllers. Or the firmware reinitializes the controllers and
> > attached devices during resume, forcing Linux's USB core to reset
> > every device on the affected buses.
> >
> > When it comes to suspend/resume, there are almost no guarantees. :-(
>
> We are definitely going through do_unbind_rebind(). But I don't think
> it matters why we got there. We seem to be calling probe() too early.
> And we need to guarantee that a driver can request firmware in probe()
>
> So something has changed in the resume code path further up.

For at least a year and a half, it has been true that rebinding takes
place during the complete phase of system resume. Clearly that is too
early to load firmware. When do you think we should do it instead?
And how should we keep track of which interfaces need rebinding?

Alan Stern

P.S.: Oliver, it looks like there's a bunch of dead code in
usb_suspend_interface() and usb_resume_interface(). I don't see how
either one can be executed with a driver that doesn't have both suspend
and resume methods. Such cases should be filtered out when
usb_suspend() calls do_unbind_rebind(). The only odd thing that can
happen is when a driver doesn't support reset-resume. Do you agree?


2012-01-02 05:35:29

by Gábor Stefanik

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Mon, Jan 2, 2012 at 5:17 AM, Marek Vasut <[email protected]> wrote:
>
> FYI this is not only problem with USB, but with PCMCIA too for example.
>
> M

Also, PCI Express. Some Nvidia graphics cards only assume their real
PCI device ID during video POST (which can happen either as part of
system POST, if the card is primary, or during Linux startup/resume,
for a secondary card).

--
Vista: [V]iruses, [I]ntruders, [S]pyware, [T]rojans and [A]dware. :-)

2012-01-02 22:47:18

by Linus Torvalds

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Mon, Jan 2, 2012 at 2:29 PM, Matthew Garrett <[email protected]> wrote:
>>
>> Why do you make up all these idiotic theoretical cases that nobody
>> cares about and has no relevance what-so-ever for the 99%?
>
> Rebooting is a theoretical case?

Your forcing the issue to be some crazy one that "cannot be solved" is
the theoretical case. I'm telling you that "reloading the firmware at
resume time" is wrong, and trivially solved by loading it before
resume time. You then make up all these invalid arguments about why
it's somehow magically impossible, but you *literally* do it by
refusing to look at what *is* possible.

It's *trivial* to attach the firmware driver and load the firmware
even if the firmware isn't needed - because you know it *will* be
needed if somebody suspends. Why not just do that? Why make up these
horrible problems that are totally irrelevant?

You can match the driver by various things like DMI strings etc, if
you really don't have a USB (or PCI or whatever) ID to match against.
But in fact, almost always you at least have things like subvendor
ID's etc that you *can* match against.

So why are you actively trying to make something complicated that
isn't? Why are you arguing against the simple solution that can easily
be made to work for the odd and unusual case you are trying to push?

Linus

2012-01-03 00:38:36

by Matthew Garrett

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Tue, Jan 03, 2012 at 12:22:58AM +0000, Jack Stone wrote:

> So we have a table of USB ids that list the devices that need firmware drivers. When the uvcvideo driver registers the firmware interface with a USB id that matches the list we load the firmware driver and pass it a "warm start" event so it can cache the firmware.
>
> As long as we can identify the devices that *might* need firmware then we are ok.

We could do all of this. Or we could just push the responsibility for
this specific case back to userspace, and maybe that's what I should
have done in the first place.

--
Matthew Garrett | [email protected]

2012-01-02 21:57:30

by Jack Stone

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On 02/01/12 21:52, Marek Vasut wrote:
>> On 02/01/12 21:23, Linus Torvalds wrote:
>>> On Mon, Jan 2, 2012 at 1:09 PM, Jack Stone <[email protected]> wrote:
>>>> What about USB "class" drivers e.g. usb-storage. They handle any device
>>>> that reports itself as a usb mass storage device. There could be a
>>>> device that needs to be bootstrapped before it becomes a generic usb
>>>> mass storage device. Do we really want to have to write a new driver
>>>> that is almost identical to the generic driver but handles the USB
>>>> firmware correctly.
>>>
>>> I'd hope that the generic driver just expose enough interfaces that
>>> you could basically do a "firmware-load" driver that just loads the
>>> firmware and then attaches the device to the generic driver.
>>
>> Sounds workable.
>>
>> To make the firmware caching easier I would propose one extra function
>> in addition to the aforemensioned get_firmware / put_firmware - a
>> find_firmware function to search the cache and return the appropriate
>> firmware blob. It should only be called if the caller already has a
>> refcount to the firmware, it's only use is to save every driver saving a
>> pointer to the firmware.
>>
>> If noone beats me to it I will try and put together an RFC for a new
>> version.
>
> The problem is on systems with limited resources, most notably RAM. If you plug
> in many devices at once, many firmwares will be cached at one point, efectivelly
> doing DoS attach on the machine?
>
> Also, how will this solve the suspect-resume issue? What if the device suspends
> only occasionally -- like every 24 hours -- then you'd have the FW cached all
> the time too?

Yes, at least to begin with. If we can come up with a robust scheme
which doesn't require the firmware to be kept in memory then that would
also be workable.

For example, drivers which know they don't ever need the firmware again
wouldn't need to cache it. That would probably be quite a small list -
there are systems that cut power to USB devices over suspend and so
those devices would need the firmware reloading.

I don't think there is anyway to avoid the memory requirement if we want
to be able to resume transparently to user-space (or even resume at all
in some setups).

Thanks,

Jack


2012-01-01 22:45:19

by Jack Stone

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On 31/12/11 00:22, Linus Torvalds wrote:
> It's also possible that we should blame the firmware code, which is
> expressly written to encourage these kinds of bugs. It may be that i
> tshould be the firmware code that has a "get_firmware()" +
> "put_firmware()" model, and it should cache the firmware explicitly if
> the config supports suspend, so that a firmware read at resume time
> would actually work. The whole "request_firmware()" interface really
> is very prone to these kinds of bugs.

How about these semantics for get_firmware / put_firmware:

get_firmware -> checks for a cached copy of the firmware and returns
that if it exists otherwise calls _get_firmware (i.e.
_request_firmware). Either way we get a refcount to the firmware blob.
This requires adding 3 members to struct firmware: the refcount, the
name of the firmware (see @name in request_firmware) and a list head to
cache with. We could also add and async version of this to match
request_firmware_nowait.

put_firmware -> Removes a refcount from the firmware and frees if there
are no more users.

put_firmware is called when we are sure that we no longer need to reload
the firmware for that device - device close time for most drivers.

USB is a more difficult case as devices can change ID but the firmware
blob still needs to be associated with them. I don't know USB at all so
I have no idea if we can track this. Some PCI devices may also suffer
the same problem.

Hope this is useful,

Jack

2012-01-01 12:27:01

by Oliver Neukum

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

Am Sonntag, 1. Januar 2012, 10:54:47 schrieb Marek Vasut:
> > Am Samstag, 31. Dezember 2011, 19:39:18 schrieb Marek Vasut:
> > > maybe we should implement such thing into the firmware loader itself?
> > > Allow it -- for example via some node in /dev -- to force loading
> > > firmware into some buffer in kernel just before suspend so it'll
> > > certainly be readily available at resume time?
> >
> > This is difficult. We don't know at suspend time which firmware we will
> > need at resume time.
>
> That's not actually true ... you can ask the drivers if they need to load
> firmware after resume ... that can be implemented and udev can handle it. The
> driver should know if the hardware it's controlling is stupid or not.

Device IDs can morph due to a power loss. After resumption another driver
would bind.

Regards
Oliver

2012-01-02 23:32:03

by Jack Stone

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On 02/01/12 23:00, Matthew Garrett wrote:
> On Mon, Jan 02, 2012 at 02:46:56PM -0800, Linus Torvalds wrote:
>
>> It's *trivial* to attach the firmware driver and load the firmware
>> even if the firmware isn't needed - because you know it *will* be
>> needed if somebody suspends. Why not just do that? Why make up these
>> horrible problems that are totally irrelevant?
>
> It means adding complexity to drivers that don't currently care about
> it, and carrying that cost even for hardware that doesn't need it. It
> can certainly be made to work, but it's inelegant. We could avoid this
> specific instance of the problem by just punting responsibility to
> userland instead.
>
> It's clear that we can solve this. All I'm saying is that just making
> the firmware loader cache things isn't a solution in itself. In this
> specific case, it means merging the isight_firmware driver into
> uvcvideo, which is something the uvcvideo maintainer didn't seem keen on
> a few years ago.
>

Rather than merging the drivers we could make the uvcvideo (or any other
generic driver) driver expose a "subdriver" interface that allows
another module to bind to the specific ID and add extra handling to
certain events. We could then use that interface to provide a firmware
driver for each device that needs it. The firmware driver could bind to
the bootstrap device and to the uvcvideo subdriver interface and it
would then have all the lifetime info we need. [Credit to Linus for
suggesting this idea]

You mensioned earlier about not being able to tell the difference
between a device that needs firmware and one that needs flash (e.g. they
use exactly the same ids). It doesn't really matter - we just assume
that it might need firmware and load it anyway. It uses more memory but
is robust.

Also, as Linus mensioned, for some devices we just don't care. For
example, if we drop a video call over suspend it's not ideal but it
doesn't prevent the system from resuming. Lets concentrate on fixing all
the cases that could prevent resume (the "99%") and fix the other
devices later.

Thanks,

Jack

2012-01-03 00:34:38

by Alan

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

> To make the firmware caching easier I would propose one extra function
> in addition to the aforemensioned get_firmware / put_firmware - a
> find_firmware function to search the cache and return the appropriate
> firmware blob.

If the caller has a recount they have a pointer. If the caller can't work
out how to assign a 32bit value to a field in a struct please get them to
work on something less important than kernel code.

Alan

2012-01-03 07:17:57

by Marek Vasut

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

> > >Hm ... and we can't have memory type that "can be swapped-out, but must
> > >be
> > >loaded back before suspend" in kernel, right?
> >
> > Nope. Kernel swapping is a big headache and Linux doesn't do it. The only
> > thing we can do is drop any clean cache pages, but that has performance
> > implications.
>
> You could however have an arrangement where the firmware refmanager does
> a list of request_firmware()s at the very start of the suspend process. It
> cannot be done during suspend for obvious reasons (sorry I already
> suspended the hard disk) but it can be done just before.

Yep, that sounds sane.
>
> In that case with get/put_firmware you'd just need to extend the API to
>
> get_firmware()
> put_firmware()
> get_for_suspend_firmware()
>
> and a driver might do something like
>
>
> module_init
>
> fw = get_for_suspend_firmware("wombatdepolariser.fw");
>
>
> open: /* just optimising */
> kref_get(&fw->kref);
>
> close:
> put_firmware(fw);
>
>
> module_unload:
> put_firmware(fw);
>
>
> Now the uglier side of that is how you make sure that if
> get_for_suspend_firmware() works that when we suspend the firmware is
> present. We could keep the file handle open but that might block unmounts

Nope, not like this.

> or we could assume that as its superuser stuff that the distro authors
> don't screw up - in which case we'd refuse to suspend with a nasty
> complaint if the firmware turned out to have been deleted behind our back.

No, we should just emit a warning after resume (or even before suspend) with
something like "Couldn't reload firmware, you are an idiot for deleting it, your
hardware won't work anymore, sorry".

>
> You'd also have to be in situation where new device probes were not
> processed but userspace (eg the fw loader helper) were still running.

Yes, I think it'd help not only usb to have a state of userland where fw loader
and a few similar things would still be operational. Aka. we'd need a way for
the kernel to tell userland part of FW loader "ok, now we're going zzz, do your
job quickly" and bump it's priority to highest (so other processes won't
interfere). Then wait till fw loader confirms that all FW was passed to the
kernel and suspend.

>
> That seems to me to be a useful optimisation project for bigger firmwares
> but not something to implement on day 1 ?

For me on PalmTX, with 32MB of RAM, every kb is good. The wifi firmware there
has 120kb or so, so I'd like to avoid having it in kernel all the time. But that
might be a bit of an extreme case.

>
> Alan

M

2012-01-02 23:50:39

by Alan

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

> So the only thing we need is to fix the braindamaged firmware
> interfaces. We need caching, and we need to replace (or at least
> extend) "load_firware()" with "[get|put]_firmware()".

So at what moment after a resume does a firmware that was in use with a
device go away and what do we do about devices inserted while suspended ?

Remembering that

- there is no moment when USB says 'we have finished re-enumeration'.
There is no point at which you can say 'we have finished
re-enumeration, we are now starting on enumeration of new stuff'. New
stuff wanting firmware will pop up on resume mixed in with old stuff.

- we don't seem to have any device or bus infrastructure for doing a
'post resume' purge of such data if we could decide when that moment
is ?

Alan

2012-01-03 03:25:17

by Matthew Garrett

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Mon, Jan 02, 2012 at 09:45:58PM -0500, Alan Stern wrote:

> Wait a second. Why does isight_firmware bind at this time? Binding to
> new devices is handled by khubd, which doesn't start running again
> until after the resume is finished (the device appears to be new
> because its descriptors have changed). At that point there should be
> no trouble reloading the firmware.

Then why are we getting this warning? The firmware is only loaded in the
probe function.

--
Matthew Garrett | [email protected]

2012-01-02 03:33:37

by Matthew Garrett

[permalink] [raw]
Subject: Re: loading firmware while usermodehelper disabled.

On Sun, Jan 01, 2012 at 10:39:13PM +0100, Michael B?sch wrote:

> I don't get it. Why would a device enter a state after resume, that it was not
> in at _any_ time before the machine was suspended?

Load firmware. Do a warm reboot. The hardware may maintain the
initialised state, but the new kernel will never have loaded it.

--
Matthew Garrett | [email protected]