2011-06-13 15:11:02

by Alan Stern

[permalink] [raw]
Subject: Unbinding drivers for resources that are in use

The kernel prevents modules from being unloaded if they are being used.
But it doesn't have any analogous mechanism for preventing a driver
being unbound from a device that's in use.

For example, suppose a SATA disk contains a mounted filesystem. If the
user writes the corresponding device name to
/sys/bus/scsi/drivers/sd/unbind without unmounting the filesystem, the
drive will become inaccessible and data may be lost. The same problem
arises with USB devices and programs using usbfs to unbind a device
from its kernel driver.

It's true that the "unbind" attribute has mode 0200 and therefore can
be written only by the superuser. Still, this puts the onus on
userspace to determine whether or not a device is being used. The
kernel could easily keep track of this automatically and atomically
-- userspace can't do this without races.

Therefore I'm asking if the driver core should add a refcount to every
struct device for keeping track of the number of open file references
(or other types of resource) using this device. If this number is
nonzero, the kernel should prevent the device from being unbound from
its driver -- except of course in cases where the device has been
hot-unplugged; there's nothing we can do to prevent errors when this
happens.

Changes to the refcount would have to propagate up the device tree: If
a device holds an important resource then we don't want any of the
device's ancestors to become inaccessible either. This would be easy
to implement.

Should we do it?

Alan Stern


2011-06-13 15:42:34

by Greg KH

[permalink] [raw]
Subject: Re: Unbinding drivers for resources that are in use

On Mon, Jun 13, 2011 at 11:10:57AM -0400, Alan Stern wrote:
> The kernel prevents modules from being unloaded if they are being used.
> But it doesn't have any analogous mechanism for preventing a driver
> being unbound from a device that's in use.
>
> For example, suppose a SATA disk contains a mounted filesystem. If the
> user writes the corresponding device name to
> /sys/bus/scsi/drivers/sd/unbind without unmounting the filesystem, the
> drive will become inaccessible and data may be lost. The same problem
> arises with USB devices and programs using usbfs to unbind a device
> from its kernel driver.
>
> It's true that the "unbind" attribute has mode 0200 and therefore can
> be written only by the superuser. Still, this puts the onus on
> userspace to determine whether or not a device is being used. The
> kernel could easily keep track of this automatically and atomically
> -- userspace can't do this without races.
>
> Therefore I'm asking if the driver core should add a refcount to every
> struct device for keeping track of the number of open file references
> (or other types of resource) using this device. If this number is
> nonzero, the kernel should prevent the device from being unbound from
> its driver -- except of course in cases where the device has been
> hot-unplugged; there's nothing we can do to prevent errors when this
> happens.
>
> Changes to the refcount would have to propagate up the device tree: If
> a device holds an important resource then we don't want any of the
> device's ancestors to become inaccessible either. This would be easy
> to implement.
>
> Should we do it?

No, people are starting to use 'unbind' as a poor-man's verison of
revoke(), by simulating the device removal from the driver, even if the
device is being used by someone at that point in time.

And that's a good thing, as that is what revoke() really wants to do,
you want to clean up whatever that device was doing and make the file
handles stale, and allow a different user to then connect to the device
if needed.

So I really would not want to disallow this type of functionality, which
adding reference counts and preventing unbind from working would cause.

thanks,

greg k-h

2011-06-13 17:49:45

by Hans de Goede

[permalink] [raw]
Subject: Re: Unbinding drivers for resources that are in use

Hi,

On 06/13/2011 05:42 PM, Greg KH wrote:
> On Mon, Jun 13, 2011 at 11:10:57AM -0400, Alan Stern wrote:
>> The kernel prevents modules from being unloaded if they are being used.
>> But it doesn't have any analogous mechanism for preventing a driver
>> being unbound from a device that's in use.
>>
>> For example, suppose a SATA disk contains a mounted filesystem. If the
>> user writes the corresponding device name to
>> /sys/bus/scsi/drivers/sd/unbind without unmounting the filesystem, the
>> drive will become inaccessible and data may be lost. The same problem
>> arises with USB devices and programs using usbfs to unbind a device
>> from its kernel driver.
>>
>> It's true that the "unbind" attribute has mode 0200 and therefore can
>> be written only by the superuser. Still, this puts the onus on
>> userspace to determine whether or not a device is being used. The
>> kernel could easily keep track of this automatically and atomically
>> -- userspace can't do this without races.
>>
>> Therefore I'm asking if the driver core should add a refcount to every
>> struct device for keeping track of the number of open file references
>> (or other types of resource) using this device. If this number is
>> nonzero, the kernel should prevent the device from being unbound from
>> its driver -- except of course in cases where the device has been
>> hot-unplugged; there's nothing we can do to prevent errors when this
>> happens.
>>
>> Changes to the refcount would have to propagate up the device tree: If
>> a device holds an important resource then we don't want any of the
>> device's ancestors to become inaccessible either. This would be easy
>> to implement.
>>
>> Should we do it?
>
> No, people are starting to use 'unbind' as a poor-man's verison of
> revoke(), by simulating the device removal from the driver, even if the
> device is being used by someone at that point in time.
>
> And that's a good thing, as that is what revoke() really wants to do,
> you want to clean up whatever that device was doing and make the file
> handles stale, and allow a different user to then connect to the device
> if needed.
>
> So I really would not want to disallow this type of functionality, which
> adding reference counts and preventing unbind from working would cause.

Allow me to clarify things a bit. Alan's mail is based on a previous
discussion on the usb-list. What I suggested there is to not change the
unbind semantics, but instead add a try_unbind or some such function
which would allow userspace to request an unbind, and only have it
success if the device is not in use, this would still require some sort
of "device in use" tracking, but that would not block the current
existing unbind.

I have a in my mind very clear cut use case for this, redirection of
usb devices from the host to a vm, this is currently supported by
at least vmware, virtualbox and qemu(-kvm). Currently these vm
providers do usb redirection by simply unbinding the current driver,
in case of vmware and virtualbox the user can do this with a single
click.

However this is not always a good thing, if the usb device in question
is a storage devices and writes are still pending (or some app has files
open on the mount of the device), the IMHO correct thing to happen
would be for the user to a get a "Sorry the device is busy" dialog
box rather then getting potential fs corruption / a crashing app.

Likewise if the usb device is a printer a printjob is currently being
spooled, we don't want the usblp driver to get unbind halfway through
the job, etc.

My initial proposal was to add a new usbfs_try_disconnect ioctl for
interfaces, and a new usb driver callback for this, which then for
example the usb-storage driver could implement.

Alan correctly pointed out that adding a driver callback to the
usb mass storage driver which checks if disconnecting is ok, is
currently not possible, because there is no such thing as device
busy tracking. There is module busy tracking, but that only tracks
if of any usb-storage linked disks are mounted, not a single
device.

Regards,

Hans

2011-06-13 19:15:52

by Greg KH

[permalink] [raw]
Subject: Re: Unbinding drivers for resources that are in use

On Mon, Jun 13, 2011 at 07:50:11PM +0200, Hans de Goede wrote:
> Allow me to clarify things a bit. Alan's mail is based on a previous
> discussion on the usb-list. What I suggested there is to not change the
> unbind semantics, but instead add a try_unbind or some such function
> which would allow userspace to request an unbind, and only have it
> success if the device is not in use, this would still require some sort
> of "device in use" tracking, but that would not block the current
> existing unbind.

Yes, I was part of that discussion and remember it well. I still don't
think this is a good idea :)

> I have a in my mind very clear cut use case for this, redirection of
> usb devices from the host to a vm, this is currently supported by
> at least vmware, virtualbox and qemu(-kvm). Currently these vm
> providers do usb redirection by simply unbinding the current driver,
> in case of vmware and virtualbox the user can do this with a single
> click.

Understood.

> However this is not always a good thing, if the usb device in question
> is a storage devices and writes are still pending (or some app has files
> open on the mount of the device), the IMHO correct thing to happen
> would be for the user to a get a "Sorry the device is busy" dialog
> box rather then getting potential fs corruption / a crashing app.

And they can get just such a dialog box if the userspace program decides
to actually check to see if the device is in use before it does this.
There is nothing keeping that from happening today, why do we need to
rework the kernel internals to provide this right now?

> Likewise if the usb device is a printer a printjob is currently being
> spooled, we don't want the usblp driver to get unbind halfway through
> the job, etc.

True, but remember, the printer spooler always keeps the device node
open, so you never know if a job is being sent or not.

And even if you did know, what's to keep it from starting up inbetween
asking if it is idle and doing the disconnect if the device node is
always open? Same goes for sound devices and lots of other types of
devices.

> My initial proposal was to add a new usbfs_try_disconnect ioctl for
> interfaces, and a new usb driver callback for this, which then for
> example the usb-storage driver could implement.
>
> Alan correctly pointed out that adding a driver callback to the
> usb mass storage driver which checks if disconnecting is ok, is
> currently not possible, because there is no such thing as device
> busy tracking. There is module busy tracking, but that only tracks
> if of any usb-storage linked disks are mounted, not a single
> device.

Yeah, and I think you would get lost down in the scsi and block code if
you were to try to implement something like this within the kernel.

Again, userspace knows all of this information today, and can decide to
unbind the device or not based on it. Why does the kernel have to be
the one tracking all of it? Especially as the policy will depend on the
different type of device to be able to determine if it is safe to unbind
or not.

thanks,

greg k-h

2011-06-14 07:34:05

by Oliver Neukum

[permalink] [raw]
Subject: Re: Unbinding drivers for resources that are in use

Am Montag, 13. Juni 2011, 17:10:57 schrieb Alan Stern:
> Therefore I'm asking if the driver core should add a refcount to every
> struct device for keeping track of the number of open file references
> (or other types of resource) using this device. If this number is
> nonzero, the kernel should prevent the device from being unbound from
> its driver -- except of course in cases where the device has been
> hot-unplugged; there's nothing we can do to prevent errors when this
> happens.
>

Firstly, the user may want to unbind a driver for a device that is in use.

Secondly, the driver doesn't know in the general case. You've given the best example
yourself. A driver certainly must not know about mounted filesystems. Things
get really hairy if you consider i-scsi and related stuff.

So I'd say it would be major work for an additional feature that doesn't
help in the case that hurts most.

Now, if you are looking for a quick and dirty solution, you could export the
pm counters and provide an ioctl for unbind if zero.

Regards
Oliver
--
- - -
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imend?rffer, HRB 16746 (AG N?rnberg)
Maxfeldstra?e 5
90409 N?rnberg
Germany
- - -