2006-10-04 20:23:16

by Jiri Kosina

[permalink] [raw]
Subject: [PATCH] ALSA: fix kernel panic in initialization of mpu401 driver

I am getting kernel panic (NULL pointer dereference) on boot, with kernel
compiled with CONFIG_SND_MPU401_UART=y, on machine which does not have
this piece of hardware.

I have traced the problem down to
sound/drivers/mpu401/mpu401.c:snd_mpu401_probe() returning EINVAL, when
either port or IRQ parameters are not specified.

In such case, the drivers/base/bus.c:bus_attach_device() does not perform
klist_add_tail() call, but rather sets dev->is_registered to 0. This flag
is however not checked by the driver, so later on, when
alsa_card_mpu401_init() is called and platform_device_register_simple()
fails, the following callchain happens, causing NULL pointer dereference:
alsa_card_mpu401_init() -> platform_device_unregister() ->
platform_device_del() -> device_del() -> bus_remove_device() ->
klist_del() -> BOOM (the entry was not added to klist in
bus_attach_device()).

Proper solution is returning ENODEV from the ->probe() routine, which will
be correctly handled then by the rest of the device-driver attaching
subsystem (namely the retval check in bus_attach_device()). The following
patch fixes the problem, please apply.

Patch against current Linus' git tree.

Signed-off-by: Jiri Kosina <[email protected]>

--- a/sound/drivers/mpu401/mpu401.c
+++ b/sound/drivers/mpu401/mpu401.c
@@ -104,11 +104,11 @@ static int __devinit snd_mpu401_probe(st

if (port[dev] == SNDRV_AUTO_PORT) {
snd_printk(KERN_ERR "specify port\n");
- return -EINVAL;
+ return -ENODEV;
}
if (irq[dev] == SNDRV_AUTO_IRQ) {
snd_printk(KERN_ERR "specify or disable IRQ\n");
- return -EINVAL;
+ return -ENODEV;
}
err = snd_mpu401_create(dev, &card);
if (err < 0)

--
Jiri Kosina


2006-10-05 08:07:35

by Jaroslav Kysela

[permalink] [raw]
Subject: Re: [PATCH] ALSA: fix kernel panic in initialization of mpu401 driver

On Wed, 4 Oct 2006, Jiri Kosina wrote:

> I am getting kernel panic (NULL pointer dereference) on boot, with kernel
> compiled with CONFIG_SND_MPU401_UART=y, on machine which does not have
> this piece of hardware.
>
> I have traced the problem down to
> sound/drivers/mpu401/mpu401.c:snd_mpu401_probe() returning EINVAL, when
> either port or IRQ parameters are not specified.
>
> In such case, the drivers/base/bus.c:bus_attach_device() does not perform
> klist_add_tail() call, but rather sets dev->is_registered to 0. This flag
> is however not checked by the driver, so later on, when
> alsa_card_mpu401_init() is called and platform_device_register_simple()
> fails, the following callchain happens, causing NULL pointer dereference:
> alsa_card_mpu401_init() -> platform_device_unregister() ->
> platform_device_del() -> device_del() -> bus_remove_device() ->
> klist_del() -> BOOM (the entry was not added to klist in
> bus_attach_device()).
>
> Proper solution is returning ENODEV from the ->probe() routine, which will
> be correctly handled then by the rest of the device-driver attaching
> subsystem (namely the retval check in bus_attach_device()). The following
> patch fixes the problem, please apply.

Unfortunately, I do not think that it's a proper solution. I think that
platform device layer should play more nicely and if probe() fails for
a reason and if platform_device_register_simple() does not set
IS_ERR(), then platform_device_unregister() must be callable to free
all resources.

Also, you've proposed to not fix all returns in snd_mpu401_probe() only
first two.

I would reject this patch and fix drivers/base/bus.c. The problematic
change is in commit f2eaae197f4590c4d96f31b09b0ee9067421a95c and this
patch will probably fix it:

[PATCH] drivers/base - check if device is registered before removal

Without this fix platform_device_unregister() might oops.

Signed-off-by: Jaroslav Kysela <[email protected]>

diff --git a/drivers/base/bus.c b/drivers/base/bus.c
index 12173d1..daa2390 100644
--- a/drivers/base/bus.c
+++ b/drivers/base/bus.c
@@ -428,8 +428,10 @@ void bus_remove_device(struct device * d
sysfs_remove_link(&dev->kobj, "bus");
sysfs_remove_link(&dev->bus->devices.kobj, dev->bus_id);
device_remove_attrs(dev->bus, dev);
- dev->is_registered = 0;
- klist_del(&dev->knode_bus);
+ if (dev->is_registered) {
+ dev->is_registered = 0;
+ klist_del(&dev->knode_bus);
+ }
pr_debug("bus %s: remove device %s\n", dev->bus->name, dev->bus_id);
device_release_driver(dev);
put_bus(dev->bus);

Jaroslav

-----
Jaroslav Kysela <[email protected]>
Linux Kernel Sound Maintainer
ALSA Project, SUSE Labs

2006-10-05 09:58:19

by Jiri Kosina

[permalink] [raw]
Subject: Re: [PATCH] ALSA: fix kernel panic in initialization of mpu401 driver

On Thu, 5 Oct 2006, Jaroslav Kysela wrote:

> Unfortunately, I do not think that it's a proper solution. I think that
> platform device layer should play more nicely and if probe() fails for a
> reason and if platform_device_register_simple() does not set IS_ERR(),
> then platform_device_unregister() must be callable to free all
> resources.

I agree.

> I would reject this patch and fix drivers/base/bus.c. The problematic
> change is in commit f2eaae197f4590c4d96f31b09b0ee9067421a95c and this
> patch will probably fix it:
> [PATCH] drivers/base - check if device is registered before removal
> Without this fix platform_device_unregister() might oops.
> Signed-off-by: Jaroslav Kysela <[email protected]>
> diff --git a/drivers/base/bus.c b/drivers/base/bus.c
> index 12173d1..daa2390 100644
> --- a/drivers/base/bus.c
> +++ b/drivers/base/bus.c
> @@ -428,8 +428,10 @@ void bus_remove_device(struct device * d
> sysfs_remove_link(&dev->kobj, "bus");
> sysfs_remove_link(&dev->bus->devices.kobj, dev->bus_id);
> device_remove_attrs(dev->bus, dev);
> - dev->is_registered = 0;
> - klist_del(&dev->knode_bus);
> + if (dev->is_registered) {
> + dev->is_registered = 0;
> + klist_del(&dev->knode_bus);
> + }
> pr_debug("bus %s: remove device %s\n", dev->bus->name, dev->bus_id);
> device_release_driver(dev);
> put_bus(dev->bus);

Yes, it (among other things) fixes the panic in MPU401 initialization.

Acked-by: Jiri Kosina <[email protected]>

--
Jiri Kosina

2006-10-05 14:02:29

by Alan Stern

[permalink] [raw]
Subject: Re: [PATCH] ALSA: fix kernel panic in initialization of mpu401 driver

On Thu, 5 Oct 2006, Jaroslav Kysela wrote:

> On Wed, 4 Oct 2006, Jiri Kosina wrote:
>
> > I am getting kernel panic (NULL pointer dereference) on boot, with kernel
> > compiled with CONFIG_SND_MPU401_UART=y, on machine which does not have
> > this piece of hardware.
> >
> > I have traced the problem down to
> > sound/drivers/mpu401/mpu401.c:snd_mpu401_probe() returning EINVAL, when
> > either port or IRQ parameters are not specified.
> >
> > In such case, the drivers/base/bus.c:bus_attach_device() does not perform
> > klist_add_tail() call, but rather sets dev->is_registered to 0. This flag
> > is however not checked by the driver, so later on, when
> > alsa_card_mpu401_init() is called and platform_device_register_simple()
> > fails, the following callchain happens, causing NULL pointer dereference:
> > alsa_card_mpu401_init() -> platform_device_unregister() ->
> > platform_device_del() -> device_del() -> bus_remove_device() ->
> > klist_del() -> BOOM (the entry was not added to klist in
> > bus_attach_device()).
> >
> > Proper solution is returning ENODEV from the ->probe() routine, which will
> > be correctly handled then by the rest of the device-driver attaching
> > subsystem (namely the retval check in bus_attach_device()). The following
> > patch fixes the problem, please apply.
>
> Unfortunately, I do not think that it's a proper solution. I think that
> platform device layer should play more nicely and if probe() fails for
> a reason and if platform_device_register_simple() does not set
> IS_ERR(), then platform_device_unregister() must be callable to free
> all resources.
>
> Also, you've proposed to not fix all returns in snd_mpu401_probe() only
> first two.
>
> I would reject this patch and fix drivers/base/bus.c. The problematic
> change is in commit f2eaae197f4590c4d96f31b09b0ee9067421a95c and this
> patch will probably fix it:

Your patch is right as far as it goes, but it misses part of the problem.
device_add() shouldn't ignore the return value from bus_attach_device().
That's why we ended up trying to unregister a device which never was
properly registered in the first place.

What do you think of this addition to your patch (untested)?

Alan Stern


Index: 18g20/drivers/base/bus.c
===================================================================
--- 18g20.orig/drivers/base/bus.c
+++ 18g20/drivers/base/bus.c
@@ -453,8 +453,10 @@ void bus_remove_device(struct device * d
remove_deprecated_bus_links(dev);
sysfs_remove_link(&dev->bus->devices.kobj, dev->bus_id);
device_remove_attrs(dev->bus, dev);
- dev->is_registered = 0;
- klist_del(&dev->knode_bus);
+ if (dev->is_registered) {
+ dev->is_registered = 0;
+ klist_del(&dev->knode_bus);
+ }
pr_debug("bus %s: remove device %s\n", dev->bus->name, dev->bus_id);
device_release_driver(dev);
put_bus(dev->bus);
Index: 18g20/drivers/base/core.c
===================================================================
--- 18g20.orig/drivers/base/core.c
+++ 18g20/drivers/base/core.c
@@ -485,7 +485,8 @@ int device_add(struct device *dev)
if ((error = bus_add_device(dev)))
goto BusError;
kobject_uevent(&dev->kobj, KOBJ_ADD);
- bus_attach_device(dev);
+ if ((error = bus_attach_device(dev)))
+ goto AttachError;
if (parent)
klist_add_tail(&dev->knode_parent, &parent->klist_children);

@@ -504,6 +505,8 @@ int device_add(struct device *dev)
kfree(class_name);
put_device(dev);
return error;
+ AttachError:
+ bus_remove_device(dev);
BusError:
device_pm_remove(dev);
PMError:

2006-10-05 14:16:07

by Jaroslav Kysela

[permalink] [raw]
Subject: Re: [PATCH] ALSA: fix kernel panic in initialization of mpu401 driver

On Thu, 5 Oct 2006, Alan Stern wrote:

> On Thu, 5 Oct 2006, Jaroslav Kysela wrote:
>
> > On Wed, 4 Oct 2006, Jiri Kosina wrote:
> >
> > > I am getting kernel panic (NULL pointer dereference) on boot, with kernel
> > > compiled with CONFIG_SND_MPU401_UART=y, on machine which does not have
> > > this piece of hardware.
> > >
> > > I have traced the problem down to
> > > sound/drivers/mpu401/mpu401.c:snd_mpu401_probe() returning EINVAL, when
> > > either port or IRQ parameters are not specified.
> > >
> > > In such case, the drivers/base/bus.c:bus_attach_device() does not perform
> > > klist_add_tail() call, but rather sets dev->is_registered to 0. This flag
> > > is however not checked by the driver, so later on, when
> > > alsa_card_mpu401_init() is called and platform_device_register_simple()
> > > fails, the following callchain happens, causing NULL pointer dereference:
> > > alsa_card_mpu401_init() -> platform_device_unregister() ->
> > > platform_device_del() -> device_del() -> bus_remove_device() ->
> > > klist_del() -> BOOM (the entry was not added to klist in
> > > bus_attach_device()).
> > >
> > > Proper solution is returning ENODEV from the ->probe() routine, which will
> > > be correctly handled then by the rest of the device-driver attaching
> > > subsystem (namely the retval check in bus_attach_device()). The following
> > > patch fixes the problem, please apply.
> >
> > Unfortunately, I do not think that it's a proper solution. I think that
> > platform device layer should play more nicely and if probe() fails for
> > a reason and if platform_device_register_simple() does not set
> > IS_ERR(), then platform_device_unregister() must be callable to free
> > all resources.
> >
> > Also, you've proposed to not fix all returns in snd_mpu401_probe() only
> > first two.
> >
> > I would reject this patch and fix drivers/base/bus.c. The problematic
> > change is in commit f2eaae197f4590c4d96f31b09b0ee9067421a95c and this
> > patch will probably fix it:
>
> Your patch is right as far as it goes, but it misses part of the problem.
> device_add() shouldn't ignore the return value from bus_attach_device().
> That's why we ended up trying to unregister a device which never was
> properly registered in the first place.
>
> What do you think of this addition to your patch (untested)?

Your addition looks good. Acked from me.

Jaroslav

> Index: 18g20/drivers/base/bus.c
> ===================================================================
> --- 18g20.orig/drivers/base/bus.c
> +++ 18g20/drivers/base/bus.c
> @@ -453,8 +453,10 @@ void bus_remove_device(struct device * d
> remove_deprecated_bus_links(dev);
> sysfs_remove_link(&dev->bus->devices.kobj, dev->bus_id);
> device_remove_attrs(dev->bus, dev);
> - dev->is_registered = 0;
> - klist_del(&dev->knode_bus);
> + if (dev->is_registered) {
> + dev->is_registered = 0;
> + klist_del(&dev->knode_bus);
> + }
> pr_debug("bus %s: remove device %s\n", dev->bus->name, dev->bus_id);
> device_release_driver(dev);
> put_bus(dev->bus);
> Index: 18g20/drivers/base/core.c
> ===================================================================
> --- 18g20.orig/drivers/base/core.c
> +++ 18g20/drivers/base/core.c
> @@ -485,7 +485,8 @@ int device_add(struct device *dev)
> if ((error = bus_add_device(dev)))
> goto BusError;
> kobject_uevent(&dev->kobj, KOBJ_ADD);
> - bus_attach_device(dev);
> + if ((error = bus_attach_device(dev)))
> + goto AttachError;
> if (parent)
> klist_add_tail(&dev->knode_parent, &parent->klist_children);
>
> @@ -504,6 +505,8 @@ int device_add(struct device *dev)
> kfree(class_name);
> put_device(dev);
> return error;
> + AttachError:
> + bus_remove_device(dev);
> BusError:
> device_pm_remove(dev);
> PMError:

-----
Jaroslav Kysela <[email protected]>
Linux Kernel Sound Maintainer
ALSA Project, SUSE Labs

2006-10-05 18:48:44

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH] ALSA: fix kernel panic in initialization of mpu401 driver

On Thu, Oct 05, 2006 at 04:16:01PM +0200, Jaroslav Kysela wrote:
> On Thu, 5 Oct 2006, Alan Stern wrote:
>
> > On Thu, 5 Oct 2006, Jaroslav Kysela wrote:
> >
> > > On Wed, 4 Oct 2006, Jiri Kosina wrote:
> > >
> > > > I am getting kernel panic (NULL pointer dereference) on boot, with kernel
> > > > compiled with CONFIG_SND_MPU401_UART=y, on machine which does not have
> > > > this piece of hardware.
> > > >
> > > > I have traced the problem down to
> > > > sound/drivers/mpu401/mpu401.c:snd_mpu401_probe() returning EINVAL, when
> > > > either port or IRQ parameters are not specified.
> > > >
> > > > In such case, the drivers/base/bus.c:bus_attach_device() does not perform
> > > > klist_add_tail() call, but rather sets dev->is_registered to 0. This flag
> > > > is however not checked by the driver, so later on, when
> > > > alsa_card_mpu401_init() is called and platform_device_register_simple()
> > > > fails, the following callchain happens, causing NULL pointer dereference:
> > > > alsa_card_mpu401_init() -> platform_device_unregister() ->
> > > > platform_device_del() -> device_del() -> bus_remove_device() ->
> > > > klist_del() -> BOOM (the entry was not added to klist in
> > > > bus_attach_device()).
> > > >
> > > > Proper solution is returning ENODEV from the ->probe() routine, which will
> > > > be correctly handled then by the rest of the device-driver attaching
> > > > subsystem (namely the retval check in bus_attach_device()). The following
> > > > patch fixes the problem, please apply.
> > >
> > > Unfortunately, I do not think that it's a proper solution. I think that
> > > platform device layer should play more nicely and if probe() fails for
> > > a reason and if platform_device_register_simple() does not set
> > > IS_ERR(), then platform_device_unregister() must be callable to free
> > > all resources.
> > >
> > > Also, you've proposed to not fix all returns in snd_mpu401_probe() only
> > > first two.
> > >
> > > I would reject this patch and fix drivers/base/bus.c. The problematic
> > > change is in commit f2eaae197f4590c4d96f31b09b0ee9067421a95c and this
> > > patch will probably fix it:
> >
> > Your patch is right as far as it goes, but it misses part of the problem.
> > device_add() shouldn't ignore the return value from bus_attach_device().
> > That's why we ended up trying to unregister a device which never was
> > properly registered in the first place.
> >
> > What do you think of this addition to your patch (untested)?
>
> Your addition looks good. Acked from me.

Great, Alan, care to resend with a better description and the proper
signed-off-by lines?

thanks,

greg k-h

2006-10-05 21:03:26

by Alan Stern

[permalink] [raw]
Subject: [PATCH] Driver core: Don't ignore error returns from probing

This patch (as797) fixes device_add() in the driver core. It needs to
pay attention when the driver for a new device reports an error.

At the same time, since bus_remove_device() undoes the effects of both
bus_add_device() and bus_attach_device(), it needs to check whether
the bus_attach_device step failed.

Signed-off-by: Alan Stern <[email protected]>

---

On Thu, 5 Oct 2006, Greg KH wrote:

> Great, Alan, care to resend with a better description and the proper
> signed-off-by lines?

Here it is. Mind you, I still haven't tested it. And the way the code
was written originally makes me wonder if the probe error wasn't
deliberately ignored. Although if it was deliberate, there ought to have
been a comment explaining things.

Anyway, the change to bus.c is exactly the same as Jaroslav's patch, and
it is obviously correct.


Index: 18g20/drivers/base/bus.c
===================================================================
--- 18g20.orig/drivers/base/bus.c
+++ 18g20/drivers/base/bus.c
@@ -453,8 +453,10 @@ void bus_remove_device(struct device * d
remove_deprecated_bus_links(dev);
sysfs_remove_link(&dev->bus->devices.kobj, dev->bus_id);
device_remove_attrs(dev->bus, dev);
- dev->is_registered = 0;
- klist_del(&dev->knode_bus);
+ if (dev->is_registered) {
+ dev->is_registered = 0;
+ klist_del(&dev->knode_bus);
+ }
pr_debug("bus %s: remove device %s\n", dev->bus->name, dev->bus_id);
device_release_driver(dev);
put_bus(dev->bus);
Index: 18g20/drivers/base/core.c
===================================================================
--- 18g20.orig/drivers/base/core.c
+++ 18g20/drivers/base/core.c
@@ -485,7 +485,8 @@ int device_add(struct device *dev)
if ((error = bus_add_device(dev)))
goto BusError;
kobject_uevent(&dev->kobj, KOBJ_ADD);
- bus_attach_device(dev);
+ if ((error = bus_attach_device(dev)))
+ goto AttachError;
if (parent)
klist_add_tail(&dev->knode_parent, &parent->klist_children);

@@ -504,6 +505,8 @@ int device_add(struct device *dev)
kfree(class_name);
put_device(dev);
return error;
+ AttachError:
+ bus_remove_device(dev);
BusError:
device_pm_remove(dev);
PMError:


2006-10-06 07:53:06

by Cornelia Huck

[permalink] [raw]
Subject: Re: [PATCH] Driver core: Don't ignore error returns from probing

On Thu, 5 Oct 2006 17:03:24 -0400 (EDT),
Alan Stern <[email protected]> wrote:

> Index: 18g20/drivers/base/bus.c
> ===================================================================
> --- 18g20.orig/drivers/base/bus.c
> +++ 18g20/drivers/base/bus.c
> @@ -453,8 +453,10 @@ void bus_remove_device(struct device * d
> remove_deprecated_bus_links(dev);
> sysfs_remove_link(&dev->bus->devices.kobj, dev->bus_id);
> device_remove_attrs(dev->bus, dev);
> - dev->is_registered = 0;
> - klist_del(&dev->knode_bus);
> + if (dev->is_registered) {
> + dev->is_registered = 0;
> + klist_del(&dev->knode_bus);
> + }
> pr_debug("bus %s: remove device %s\n", dev->bus->name, dev->bus_id);
> device_release_driver(dev);
> put_bus(dev->bus);
> Index: 18g20/drivers/base/core.c
> ===================================================================
> --- 18g20.orig/drivers/base/core.c
> +++ 18g20/drivers/base/core.c
> @@ -485,7 +485,8 @@ int device_add(struct device *dev)
> if ((error = bus_add_device(dev)))
> goto BusError;
> kobject_uevent(&dev->kobj, KOBJ_ADD);
> - bus_attach_device(dev);
> + if ((error = bus_attach_device(dev)))
> + goto AttachError;
> if (parent)
> klist_add_tail(&dev->knode_parent, &parent->klist_children);
>
> @@ -504,6 +505,8 @@ int device_add(struct device *dev)
> kfree(class_name);
> put_device(dev);
> return error;
> + AttachError:
> + bus_remove_device(dev);
> BusError:
> device_pm_remove(dev);
> PMError:

Hm, I don't think we should call device_release_driver if
bus_attach_device failed (and I think calling bus_remove_device if
bus_attach_device failed is unintuitive). I did a patch that added a
function which undid just the things bus_add_device did (here:
http://marc.theaimsgroup.com/?l=linux-kernel&m=115816560424389&w=2),
which unfortunately got lost somewhere... (I'll rebase and resend.)

--
Cornelia Huck
Linux for zSeries Developer
Tel.: +49-7031-16-4837, Mail: [email protected]

2006-10-06 07:56:48

by Cornelia Huck

[permalink] [raw]
Subject: [PATCH] driver core: bus_attach_device() retval check

From: Cornelia Huck <[email protected]>

Check for return value of bus_attach_device() in device_add(). Add a
function bus_delete_device() that undos the effects of bus_add_device().

Signed-off-by: Cornelia Huck <[email protected]>

---
drivers/base/base.h | 1 +
drivers/base/bus.c | 21 ++++++++++++++++++++-
drivers/base/core.c | 6 +++++-
3 files changed, 26 insertions(+), 2 deletions(-)

--- linux-2.6-CH.orig/drivers/base/base.h
+++ linux-2.6-CH/drivers/base/base.h
@@ -17,6 +17,7 @@ extern int attribute_container_init(void

extern int bus_add_device(struct device * dev);
extern int bus_attach_device(struct device * dev);
+extern void bus_delete_device(struct device * dev);
extern void bus_remove_device(struct device * dev);
extern struct bus_type *get_bus(struct bus_type * bus);
extern void put_bus(struct bus_type * bus);
--- linux-2.6-CH.orig/drivers/base/bus.c
+++ linux-2.6-CH/drivers/base/bus.c
@@ -360,7 +360,7 @@ static void device_remove_attrs(struct b
* bus_add_device - add device to bus
* @dev: device being added
*
- * - Add the device to its bus's list of devices.
+ * - Add attributes.
* - Create link to device's bus.
*/
int bus_add_device(struct device * dev)
@@ -424,6 +424,25 @@ int bus_attach_device(struct device * de
}

/**
+ * bus_delete_device - undo bus_add_device
+ * @dev: device being deleted
+ *
+ * - Remove symlink from bus's directory.
+ * - Remove attributes.
+ * - Drop reference taken in bus_add_device().
+ */
+void bus_delete_device(struct device * dev)
+{
+ if (dev->bus) {
+ sysfs_remove_link(&dev->kobj, "subsystem");
+ sysfs_remove_link(&dev->kobj, "bus");
+ sysfs_remove_link(&dev->bus->devices.kobj, dev->bus_id);
+ device_remove_attrs(dev->bus, dev);
+ put_bus(dev->bus);
+ }
+}
+
+/**
* bus_remove_device - remove device from bus
* @dev: device to be removed
*
--- linux-2.6-CH.orig/drivers/base/core.c
+++ linux-2.6-CH/drivers/base/core.c
@@ -479,7 +479,9 @@ int device_add(struct device *dev)
if ((error = bus_add_device(dev)))
goto BusError;
kobject_uevent(&dev->kobj, KOBJ_ADD);
- bus_attach_device(dev);
+ error = bus_attach_device(dev);
+ if (error)
+ goto attachError;
if (parent)
klist_add_tail(&dev->knode_parent, &parent->klist_children);

@@ -498,6 +500,8 @@ int device_add(struct device *dev)
kfree(class_name);
put_device(dev);
return error;
+ attachError:
+ bus_delete_device(dev);
BusError:
device_pm_remove(dev);
PMError:

2006-10-06 09:41:10

by Jaroslav Kysela

[permalink] [raw]
Subject: Re: [Alsa-devel] [PATCH] Driver core: Don't ignore error returns from probing

On Fri, 6 Oct 2006, Cornelia Huck wrote:

> On Thu, 5 Oct 2006 17:03:24 -0400 (EDT),
> Alan Stern <[email protected]> wrote:
>
> > Index: 18g20/drivers/base/bus.c
> > ===================================================================
> > --- 18g20.orig/drivers/base/bus.c
> > +++ 18g20/drivers/base/bus.c
> > @@ -453,8 +453,10 @@ void bus_remove_device(struct device * d
> > remove_deprecated_bus_links(dev);
> > sysfs_remove_link(&dev->bus->devices.kobj, dev->bus_id);
> > device_remove_attrs(dev->bus, dev);
> > - dev->is_registered = 0;
> > - klist_del(&dev->knode_bus);
> > + if (dev->is_registered) {
> > + dev->is_registered = 0;
> > + klist_del(&dev->knode_bus);
> > + }
> > pr_debug("bus %s: remove device %s\n", dev->bus->name, dev->bus_id);
> > device_release_driver(dev);
> > put_bus(dev->bus);
> > Index: 18g20/drivers/base/core.c
> > ===================================================================
> > --- 18g20.orig/drivers/base/core.c
> > +++ 18g20/drivers/base/core.c
> > @@ -485,7 +485,8 @@ int device_add(struct device *dev)
> > if ((error = bus_add_device(dev)))
> > goto BusError;
> > kobject_uevent(&dev->kobj, KOBJ_ADD);
> > - bus_attach_device(dev);
> > + if ((error = bus_attach_device(dev)))
> > + goto AttachError;
> > if (parent)
> > klist_add_tail(&dev->knode_parent, &parent->klist_children);
> >
> > @@ -504,6 +505,8 @@ int device_add(struct device *dev)
> > kfree(class_name);
> > put_device(dev);
> > return error;
> > + AttachError:
> > + bus_remove_device(dev);
> > BusError:
> > device_pm_remove(dev);
> > PMError:
>
> Hm, I don't think we should call device_release_driver if
> bus_attach_device failed (and I think calling bus_remove_device if
> bus_attach_device failed is unintuitive). I did a patch that added a
> function which undid just the things bus_add_device did (here:
> http://marc.theaimsgroup.com/?l=linux-kernel&m=115816560424389&w=2),
> which unfortunately got lost somewhere... (I'll rebase and resend.)

Yes, but it might be better to check dev->is_registered flag in
bus_remove_device() before device_release_driver() call to save some code,
rather than reuse most of code in bus_delete_device().

Jaroslav

-----
Jaroslav Kysela <[email protected]>
Linux Kernel Sound Maintainer
ALSA Project, SUSE Labs

2006-10-06 11:14:13

by Cornelia Huck

[permalink] [raw]
Subject: Re: [Alsa-devel] [PATCH] Driver core: Don't ignore error returns from probing

On Fri, 6 Oct 2006 11:41:05 +0200 (CEST),
Jaroslav Kysela <[email protected]> wrote:

> > Hm, I don't think we should call device_release_driver if
> > bus_attach_device failed (and I think calling bus_remove_device if
> > bus_attach_device failed is unintuitive). I did a patch that added a
> > function which undid just the things bus_add_device did (here:
> > http://marc.theaimsgroup.com/?l=linux-kernel&m=115816560424389&w=2),
> > which unfortunately got lost somewhere... (I'll rebase and resend.)
>
> Yes, but it might be better to check dev->is_registered flag in
> bus_remove_device() before device_release_driver() call to save some code,
> rather than reuse most of code in bus_delete_device().

If we undid things (symlinks et al.) in the order we added them, we can
factor out bus_delete_device() from bus_remove_device() and avoid both
code duplication and calling bus_remove_device() if bus_attach_device()
failed. Something like the patch below (untested).


--- linux-2.6-CH.orig/drivers/base/base.h
+++ linux-2.6-CH/drivers/base/base.h
@@ -17,6 +17,7 @@ extern int attribute_container_init(void

extern int bus_add_device(struct device * dev);
extern int bus_attach_device(struct device * dev);
+extern void bus_delete_device(struct device * dev);
extern void bus_remove_device(struct device * dev);
extern struct bus_type *get_bus(struct bus_type * bus);
extern void put_bus(struct bus_type * bus);
--- linux-2.6-CH.orig/drivers/base/bus.c
+++ linux-2.6-CH/drivers/base/bus.c
@@ -360,7 +360,7 @@ static void device_remove_attrs(struct b
* bus_add_device - add device to bus
* @dev: device being added
*
- * - Add the device to its bus's list of devices.
+ * - Add attributes.
* - Create link to device's bus.
*/
int bus_add_device(struct device * dev)
@@ -424,29 +424,44 @@ int bus_attach_device(struct device * de
}

/**
- * bus_remove_device - remove device from bus
- * @dev: device to be removed
+ * bus_delete_device - undo bus_add_device
+ * @dev: device being deleted
*
- * - Remove symlink from bus's directory.
- * - Delete device from bus's list.
- * - Detach from its driver.
- * - Drop reference taken in bus_add_device().
+ * - Remove symlink from bus's directory.
+ * - Remove attributes.
+ * - Drop reference taken in bus_add_device().
*/
-void bus_remove_device(struct device * dev)
+void bus_delete_device(struct device * dev)
{
if (dev->bus) {
sysfs_remove_link(&dev->kobj, "subsystem");
sysfs_remove_link(&dev->kobj, "bus");
sysfs_remove_link(&dev->bus->devices.kobj, dev->bus_id);
device_remove_attrs(dev->bus, dev);
- dev->is_registered = 0;
- klist_del(&dev->knode_bus);
- pr_debug("bus %s: remove device %s\n", dev->bus->name, dev->bus_id);
- device_release_driver(dev);
put_bus(dev->bus);
}
}

+/**
+ * bus_remove_device - remove device from bus
+ * @dev: device to be removed
+ *
+ * - Remove symlink from bus's directory.
+ * - Delete device from bus's list.
+ * - Detach from its driver.
+ * - Drop reference taken in bus_add_device().
+ */
+void bus_remove_device(struct device * dev)
+{
+ if (!dev->bus)
+ return;
+ dev->is_registered = 0;
+ klist_del(&dev->knode_bus);
+ pr_debug("bus %s: remove device %s\n", dev->bus->name, dev->bus_id);
+ device_release_driver(dev);
+ bus_delete_device(dev);
+}
+
static int driver_add_attrs(struct bus_type * bus, struct device_driver * drv)
{
int error = 0;
--- linux-2.6-CH.orig/drivers/base/core.c
+++ linux-2.6-CH/drivers/base/core.c
@@ -479,7 +479,9 @@ int device_add(struct device *dev)
if ((error = bus_add_device(dev)))
goto BusError;
kobject_uevent(&dev->kobj, KOBJ_ADD);
- bus_attach_device(dev);
+ error = bus_attach_device(dev);
+ if (error)
+ goto attachError;
if (parent)
klist_add_tail(&dev->knode_parent, &parent->klist_children);

@@ -498,6 +500,8 @@ int device_add(struct device *dev)
kfree(class_name);
put_device(dev);
return error;
+ attachError:
+ bus_delete_device(dev);
BusError:
device_pm_remove(dev);
PMError:

2006-10-06 11:46:55

by Jaroslav Kysela

[permalink] [raw]
Subject: Re: [Alsa-devel] [PATCH] Driver core: Don't ignore error returns from probing

On Fri, 6 Oct 2006, Cornelia Huck wrote:

> On Fri, 6 Oct 2006 11:41:05 +0200 (CEST),
> Jaroslav Kysela <[email protected]> wrote:
>
> > > Hm, I don't think we should call device_release_driver if
> > > bus_attach_device failed (and I think calling bus_remove_device if
> > > bus_attach_device failed is unintuitive). I did a patch that added a
> > > function which undid just the things bus_add_device did (here:
> > > http://marc.theaimsgroup.com/?l=linux-kernel&m=115816560424389&w=2),
> > > which unfortunately got lost somewhere... (I'll rebase and resend.)
> >
> > Yes, but it might be better to check dev->is_registered flag in
> > bus_remove_device() before device_release_driver() call to save some code,
> > rather than reuse most of code in bus_delete_device().
>
> If we undid things (symlinks et al.) in the order we added them, we can
> factor out bus_delete_device() from bus_remove_device() and avoid both
> code duplication and calling bus_remove_device() if bus_attach_device()
> failed. Something like the patch below (untested).

It looks better, but I think that having only one function with if
(is_registered) saves a few bytes of instruction memory. Anyway, I do not
feel myself to judge what's the best.

Jaroslav

-----
Jaroslav Kysela <[email protected]>
Linux Kernel Sound Maintainer
ALSA Project, SUSE Labs

2006-10-06 18:12:51

by Alan Stern

[permalink] [raw]
Subject: Re: [Alsa-devel] [PATCH] Driver core: Don't ignore error returns from probing

On Fri, 6 Oct 2006, Cornelia Huck wrote:

> On Fri, 6 Oct 2006 11:41:05 +0200 (CEST),
> Jaroslav Kysela <[email protected]> wrote:
>
> > > Hm, I don't think we should call device_release_driver if
> > > bus_attach_device failed (and I think calling bus_remove_device if
> > > bus_attach_device failed is unintuitive). I did a patch that added a
> > > function which undid just the things bus_add_device did (here:
> > > http://marc.theaimsgroup.com/?l=linux-kernel&m=115816560424389&w=2),
> > > which unfortunately got lost somewhere... (I'll rebase and resend.)

I'm still not sure why bus_attach_device() was split off from
bus_add_device() in the first place. Was it just so that the
kobject_uevent() call could go in between?

Is there any reason they couldn't be recombined into a single function?

> > Yes, but it might be better to check dev->is_registered flag in
> > bus_remove_device() before device_release_driver() call to save some code,
> > rather than reuse most of code in bus_delete_device().

Agreed; I don't like the duplication of code. It's wasteful and
error-prone (somebody might change one routine and not the other).

> If we undid things (symlinks et al.) in the order we added them, we can
> factor out bus_delete_device() from bus_remove_device() and avoid both
> code duplication and calling bus_remove_device() if bus_attach_device()
> failed. Something like the patch below (untested).

This looks okay, but it would be better if bus_remove_device() did not
directly call bus_delete_device(). Just add an extra call inside
device_del(), so that everything remains symmetrical.

While I'm harping on style issues, you should also capitalize AttachError
so that it matches the form of the other statement labels nearby. And in
bus_remove_device() you should put all the code inside the "if" block
instead of returning when dev->bus isn't set, just as the neighboring
subroutines do.

Alan Stern

2006-10-09 11:09:43

by Cornelia Huck

[permalink] [raw]
Subject: Re: [Alsa-devel] [PATCH] Driver core: Don't ignore error returns from probing

On Fri, 6 Oct 2006 14:12:49 -0400 (EDT),
Alan Stern <[email protected]> wrote:

> I'm still not sure why bus_attach_device() was split off from
> bus_add_device() in the first place. Was it just so that the
> kobject_uevent() call could go in between?

I think yes. This was added in
http://marc.theaimsgroup.com/?l=linux-kernel&m=115092084915731&w=2

> This looks okay, but it would be better if bus_remove_device() did not
> directly call bus_delete_device(). Just add an extra call inside
> device_del(), so that everything remains symmetrical.
>
> While I'm harping on style issues, you should also capitalize AttachError
> so that it matches the form of the other statement labels nearby. And in
> bus_remove_device() you should put all the code inside the "if" block
> instead of returning when dev->bus isn't set, just as the neighboring
> subroutines do.

OK, new patch on the way.

--
Cornelia Huck
Linux for zSeries Developer
Tel.: +49-7031-16-4837, Mail: [email protected]

2006-10-09 11:14:04

by Cornelia Huck

[permalink] [raw]
Subject: [PATCH] Driver core: Don't ignore bus_attach_device() retval

From: Cornelia Huck <[email protected]>

Check for return value of bus_attach_device() in device_add(). Add a
function bus_delete_device() that undos the effects of bus_add_device().
bus_remove_device() now undos the effects of bus_attach_device() only.
device_del() now calls bus_remove_device(), kobject_uevent(),
bus_delete_device() which makes it symmetric to the call sequence in
device_add().

Signed-off-by: Cornelia Huck <[email protected]>

---
drivers/base/base.h | 1 +
drivers/base/bus.c | 31 ++++++++++++++++++++++---------
drivers/base/core.c | 7 ++++++-
3 files changed, 29 insertions(+), 10 deletions(-)

--- linux.orig/drivers/base/base.h
+++ linux/drivers/base/base.h
@@ -17,6 +17,7 @@ extern int attribute_container_init(void

extern int bus_add_device(struct device * dev);
extern int bus_attach_device(struct device * dev);
+extern void bus_delete_device(struct device * dev);
extern void bus_remove_device(struct device * dev);
extern struct bus_type *get_bus(struct bus_type * bus);
extern void put_bus(struct bus_type * bus);
--- linux.orig/drivers/base/bus.c
+++ linux/drivers/base/bus.c
@@ -360,7 +360,7 @@ static void device_remove_attrs(struct b
* bus_add_device - add device to bus
* @dev: device being added
*
- * - Add the device to its bus's list of devices.
+ * - Add attributes.
* - Create link to device's bus.
*/
int bus_add_device(struct device * dev)
@@ -424,26 +424,39 @@ int bus_attach_device(struct device * de
}

/**
+ * bus_delete_device - undo bus_add_device
+ * @dev: device being deleted
+ *
+ * - Remove symlink from bus's directory.
+ * - Remove attributes.
+ * - Drop reference taken in bus_add_device().
+ */
+void bus_delete_device(struct device * dev)
+{
+ if (dev->bus) {
+ sysfs_remove_link(&dev->kobj, "subsystem");
+ sysfs_remove_link(&dev->kobj, "bus");
+ sysfs_remove_link(&dev->bus->devices.kobj, dev->bus_id);
+ device_remove_attrs(dev->bus, dev);
+ put_bus(dev->bus);
+ }
+}
+
+/**
* bus_remove_device - remove device from bus
* @dev: device to be removed
*
- * - Remove symlink from bus's directory.
* - Delete device from bus's list.
* - Detach from its driver.
- * - Drop reference taken in bus_add_device().
*/
void bus_remove_device(struct device * dev)
{
if (dev->bus) {
- sysfs_remove_link(&dev->kobj, "subsystem");
- sysfs_remove_link(&dev->kobj, "bus");
- sysfs_remove_link(&dev->bus->devices.kobj, dev->bus_id);
- device_remove_attrs(dev->bus, dev);
dev->is_registered = 0;
klist_del(&dev->knode_bus);
- pr_debug("bus %s: remove device %s\n", dev->bus->name, dev->bus_id);
+ pr_debug("bus %s: remove device %s\n", dev->bus->name,
+ dev->bus_id);
device_release_driver(dev);
- put_bus(dev->bus);
}
}

--- linux.orig/drivers/base/core.c
+++ linux/drivers/base/core.c
@@ -479,7 +479,9 @@ int device_add(struct device *dev)
if ((error = bus_add_device(dev)))
goto BusError;
kobject_uevent(&dev->kobj, KOBJ_ADD);
- bus_attach_device(dev);
+ error = bus_attach_device(dev);
+ if (error)
+ goto AttachError;
if (parent)
klist_add_tail(&dev->knode_parent, &parent->klist_children);

@@ -498,6 +500,8 @@ int device_add(struct device *dev)
kfree(class_name);
put_device(dev);
return error;
+ AttachError:
+ bus_delete_device(dev);
BusError:
device_pm_remove(dev);
PMError:
@@ -620,6 +624,7 @@ void device_del(struct device * dev)
bus_remove_device(dev);
device_pm_remove(dev);
kobject_uevent(&dev->kobj, KOBJ_REMOVE);
+ bus_delete_device(dev);
kobject_del(&dev->kobj);
if (parent)
put_device(parent);

2006-10-09 14:10:51

by Alan Stern

[permalink] [raw]
Subject: Re: [PATCH] Driver core: Don't ignore bus_attach_device() retval

On Mon, 9 Oct 2006, Cornelia Huck wrote:

> From: Cornelia Huck <[email protected]>
>
> Check for return value of bus_attach_device() in device_add(). Add a
> function bus_delete_device() that undos the effects of bus_add_device().
> bus_remove_device() now undos the effects of bus_attach_device() only.
> device_del() now calls bus_remove_device(), kobject_uevent(),
> bus_delete_device() which makes it symmetric to the call sequence in
> device_add().
>
> Signed-off-by: Cornelia Huck <[email protected]>

This looks good to me.

Alan Stern

2006-10-11 14:49:38

by Alan Stern

[permalink] [raw]
Subject: Re: [PATCH] Driver core: Don't ignore bus_attach_device() retval

On Mon, 9 Oct 2006, Cornelia Huck wrote:

> From: Cornelia Huck <[email protected]>
>
> Check for return value of bus_attach_device() in device_add(). Add a
> function bus_delete_device() that undos the effects of bus_add_device().
> bus_remove_device() now undos the effects of bus_attach_device() only.
> device_del() now calls bus_remove_device(), kobject_uevent(),
> bus_delete_device() which makes it symmetric to the call sequence in
> device_add().

You know, I'm not so sure device registration should fail when
bus_attach_device() returns an error.

After all, the device really is there even if it's not working properly.
In the Windows device manager it would show up with a big red X through
it, but it _would_ show up.

Furthermore there are subtle problems that can arise. In effect, the
device is registered for a brief time (while the driver is probed) and
then unregistered without giving the bus subsystem a chance to prepare for
the removal. With USB this can lead to problems; if the driver called
usb_set_interface() then child devices would be created below the one
being probed -- and they would never get removed.

Has this question been raised before? Is there any reason not to
register a device even when probing fails?

In fact, we might want to separate driver probing from device_add()
entirely. That is, make them available as two separate function calls.
That way the subsystem driver will have a chance to create attribute files
before a uevent is generated and a driver is loaded. (That should help
udev to work better.) This would require a larger change, though --
probably requiring an alternate version of device_add().

Alan Stern


2006-10-12 09:30:31

by Cornelia Huck

[permalink] [raw]
Subject: Re: [PATCH] Driver core: Don't ignore bus_attach_device() retval

On Wed, 11 Oct 2006 10:49:36 -0400 (EDT),
Alan Stern <[email protected]> wrote:

> You know, I'm not so sure device registration should fail when
> bus_attach_device() returns an error.

Hm, let's see why bus_attach_device() might fail:

* device_bind_driver() failed to create some symlinks. We may
consider not to fail in this case, since sysfs_remove_link() is fine
even for non-existing links.

* probing failed for one possible driver with something other than
-ENODEV or -ENXIO. Not sure if we really should abort in this case.
We'd just end up with an unbound device, and a driver returning (for
example) -ENOMEM for probing may just be a really dumb driver trying to
allocate an insane amount of memory (and the next driver might just be
fine).

> Furthermore there are subtle problems that can arise. In effect, the
> device is registered for a brief time (while the driver is probed) and
> then unregistered without giving the bus subsystem a chance to prepare for
> the removal. With USB this can lead to problems; if the driver called
> usb_set_interface() then child devices would be created below the one
> being probed -- and they would never get removed.

One way to fix this would be to make device_bind_driver() always
succeed (even without symlinks), the other to call the ->remove
function if device_bind_driver() fails (assuming that the ->remove
method should undo the stuff done in ->probe).

> In fact, we might want to separate driver probing from device_add()
> entirely. That is, make them available as two separate function calls.
> That way the subsystem driver will have a chance to create attribute files
> before a uevent is generated and a driver is loaded. (That should help
> udev to work better.) This would require a larger change, though --
> probably requiring an alternate version of device_add().

Shouldn't subsystems that need attributes early just use dev->groups,
class->dev_attrs or bus->dev_attrs? These attribute groups are added
before the uevent is generated.

--
Cornelia Huck
Linux for zSeries Developer
Tel.: +49-7031-16-4837, Mail: [email protected]

2006-10-12 16:00:07

by Alan Stern

[permalink] [raw]
Subject: Re: [PATCH] Driver core: Don't ignore bus_attach_device() retval

On Thu, 12 Oct 2006, Cornelia Huck wrote:

> On Wed, 11 Oct 2006 10:49:36 -0400 (EDT),
> Alan Stern <[email protected]> wrote:
>
> > You know, I'm not so sure device registration should fail when
> > bus_attach_device() returns an error.
>
> Hm, let's see why bus_attach_device() might fail:
>
> * device_bind_driver() failed to create some symlinks. We may
> consider not to fail in this case, since sysfs_remove_link() is fine
> even for non-existing links.

It would be okay to fail in this case, because the driver would not have
been probed at all.

> * probing failed for one possible driver with something other than
> -ENODEV or -ENXIO. Not sure if we really should abort in this case.
> We'd just end up with an unbound device, and a driver returning (for
> example) -ENOMEM for probing may just be a really dumb driver trying to
> allocate an insane amount of memory (and the next driver might just be
> fine).

Yes, this is exactly what I meant. Having an unbound device is okay.

Note also that when multithreaded probing is used, there is no way for
driver_probe_device() to return an error from really_probe(). We may as
well act the same way when multithreaded probing isn't used.

I haven't looked at the driver core much since multithreaded probing was
added. The multithreading part has a bad locking bug: the new thread
doesn't acquire the necessary semaphores. Also it probes multiple drivers
for the same device in parallel, which seems wrong. Since multiple probes
can't run concurrently there's no reason to have a separate thread for
each one. There should be only one new thread per device.


> > Furthermore there are subtle problems that can arise. In effect, the
> > device is registered for a brief time (while the driver is probed) and
> > then unregistered without giving the bus subsystem a chance to prepare for
> > the removal. With USB this can lead to problems; if the driver called
> > usb_set_interface() then child devices would be created below the one
> > being probed -- and they would never get removed.
>
> One way to fix this would be to make device_bind_driver() always
> succeed (even without symlinks),

Hmm... If device_bind_driver() fails -- because of the symlinks -- then
the device is still on the driver's klist, because the klist_add_tail() in
driver_bound() never gets undone. Another bug. Clearly
device_bind_driver() should call driver_sysfs_add() before driver_bound(),
not after. Or else it should never fail.

> the other to call the ->remove
> function if device_bind_driver() fails (assuming that the ->remove
> method should undo the stuff done in ->probe).

It's a lot safer and easier just to switch the order of the calls in
device_bind_driver().

Note that it's also possible for device_attach() to fail through the
__device_attach() -> driver_probe_device() -> really_probe() pathway. I
think in all cases, bus_attach_device() really should ignore the return
code from device_attach().


> > In fact, we might want to separate driver probing from device_add()
> > entirely. That is, make them available as two separate function calls.
> > That way the subsystem driver will have a chance to create attribute files
> > before a uevent is generated and a driver is loaded. (That should help
> > udev to work better.) This would require a larger change, though --
> > probably requiring an alternate version of device_add().
>
> Shouldn't subsystems that need attributes early just use dev->groups,
> class->dev_attrs or bus->dev_attrs? These attribute groups are added
> before the uevent is generated.

Yes, you're right. Forget about this part.

Alan Stern

2006-10-12 17:17:01

by Cornelia Huck

[permalink] [raw]
Subject: Re: [PATCH] Driver core: Don't ignore bus_attach_device() retval

On Thu, 12 Oct 2006 11:59:45 -0400 (EDT),
Alan Stern <[email protected]> wrote:

> > * device_bind_driver() failed to create some symlinks. We may
> > consider not to fail in this case, since sysfs_remove_link() is fine
> > even for non-existing links.
>
> It would be okay to fail in this case, because the driver would not have
> been probed at all.

In really_probe(), it is called after ->probe has been called (though
the code does nothing but complain in the failure case).

> > * probing failed for one possible driver with something other than
> > -ENODEV or -ENXIO. Not sure if we really should abort in this case.
> > We'd just end up with an unbound device, and a driver returning (for
> > example) -ENOMEM for probing may just be a really dumb driver trying to
> > allocate an insane amount of memory (and the next driver might just be
> > fine).
>
> Yes, this is exactly what I meant. Having an unbound device is okay.

Maybe ignoring all probe errors would be best here? (Calling
device_bind_driver() only on success, of course.)

> I haven't looked at the driver core much since multithreaded probing was
> added. The multithreading part has a bad locking bug: the new thread
> doesn't acquire the necessary semaphores. Also it probes multiple drivers
> for the same device in parallel, which seems wrong. Since multiple probes
> can't run concurrently there's no reason to have a separate thread for
> each one. There should be only one new thread per device.

Currently, it is the device driver which specifies whether multithreaded
probe should be done. Maybe this should rather be specified per
subsystem? Having several bus_for_each_drv(..., dev, __device_attach)
run in parallel makes more sense to me than the current approach.

> > One way to fix this would be to make device_bind_driver() always
> > succeed (even without symlinks),
>
> Hmm... If device_bind_driver() fails -- because of the symlinks -- then
> the device is still on the driver's klist, because the klist_add_tail() in
> driver_bound() never gets undone. Another bug.

There's already been a fix:
http://marc.theaimsgroup.com/?l=linux-kernel&m=116062971226779&w=2

> Clearly
> device_bind_driver() should call driver_sysfs_add() before driver_bound(),
> not after. Or else it should never fail.

Or that. I'm currently a bit in favour of ignoring symlink errors.

> It's a lot safer and easier just to switch the order of the calls in
> device_bind_driver().

?? Did you mean really_probe() here?

> I
> think in all cases, bus_attach_device() really should ignore the return
> code from device_attach().

This would be the only sensible approach if we had one probing thread
per device. And device_bind_driver() should probably always succeed.

--
Cornelia Huck
Linux for zSeries Developer
Tel.: +49-7031-16-4837, Mail: [email protected]

2006-10-12 21:41:50

by Alan Stern

[permalink] [raw]
Subject: Re: [PATCH] Driver core: Don't ignore bus_attach_device() retval

On Thu, 12 Oct 2006, Cornelia Huck wrote:

> On Thu, 12 Oct 2006 11:59:45 -0400 (EDT),
> Alan Stern <[email protected]> wrote:
>
> > > * device_bind_driver() failed to create some symlinks. We may
> > > consider not to fail in this case, since sysfs_remove_link() is fine
> > > even for non-existing links.
> >
> > It would be okay to fail in this case, because the driver would not have
> > been probed at all.
>
> In really_probe(), it is called after ->probe has been called (though
> the code does nothing but complain in the failure case).

?? In really_probe(), driver_sysfs_add() is called _before_ ->probe. So
it's okay to fail there. And in device_bind_driver(), which is what you
were talking about above, ->probe doesn't get called anywhere.

At least, in my copy of Greg KH's development tree that's what happens.


> > > * probing failed for one possible driver with something other than
> > > -ENODEV or -ENXIO. Not sure if we really should abort in this case.
> > > We'd just end up with an unbound device, and a driver returning (for
> > > example) -ENOMEM for probing may just be a really dumb driver trying to
> > > allocate an insane amount of memory (and the next driver might just be
> > > fine).
> >
> > Yes, this is exactly what I meant. Having an unbound device is okay.
>
> Maybe ignoring all probe errors would be best here? (Calling
> device_bind_driver() only on success, of course.)

I agree.


> > I haven't looked at the driver core much since multithreaded probing was
> > added. The multithreading part has a bad locking bug: the new thread
> > doesn't acquire the necessary semaphores. Also it probes multiple drivers
> > for the same device in parallel, which seems wrong. Since multiple probes
> > can't run concurrently there's no reason to have a separate thread for
> > each one. There should be only one new thread per device.
>
> Currently, it is the device driver which specifies whether multithreaded
> probe should be done. Maybe this should rather be specified per
> subsystem? Having several bus_for_each_drv(..., dev, __device_attach)
> run in parallel makes more sense to me than the current approach.

Doing it per subsystem sounds right. It looks like the multithread
probing code was added without sufficient thought beforehand.


> > > One way to fix this would be to make device_bind_driver() always
> > > succeed (even without symlinks),
> >
> > Hmm... If device_bind_driver() fails -- because of the symlinks -- then
> > the device is still on the driver's klist, because the klist_add_tail() in
> > driver_bound() never gets undone. Another bug.
>
> There's already been a fix:
> http://marc.theaimsgroup.com/?l=linux-kernel&m=116062971226779&w=2

Oh, okay. (Hmmm, that message was posted earlier today...)

> > Clearly
> > device_bind_driver() should call driver_sysfs_add() before driver_bound(),
> > not after. Or else it should never fail.
>
> Or that. I'm currently a bit in favour of ignoring symlink errors.

I don't care too much one way or another. Although missing symlinks might
cause problems for certain user programs.

> > It's a lot safer and easier just to switch the order of the calls in
> > device_bind_driver().
>
> ?? Did you mean really_probe() here?

No, I meant device_bind_driver(). Try to create the symlinks first, and
if that succeeds (or if you don't care when it fails) then call
driver_bound(). This confusion may be a result of looking at different
source trees.

> > I
> > think in all cases, bus_attach_device() really should ignore the return
> > code from device_attach().
>
> This would be the only sensible approach if we had one probing thread
> per device. And device_bind_driver() should probably always succeed.

Other people may disagree about device_bind_driver(). Best to let it fail
for now and fix these other problems first. It can always be changed
later.

Do you want to write another patch?

Alan Stern

2006-10-13 06:36:54

by Cornelia Huck

[permalink] [raw]
Subject: Re: [PATCH] Driver core: Don't ignore bus_attach_device() retval

On Thu, 12 Oct 2006 17:41:46 -0400 (EDT),
Alan Stern <[email protected]> wrote:

> I don't care too much one way or another. Although missing symlinks might
> cause problems for certain user programs.

Hm, yes, that might happen. I'll give it some more thought.

> No, I meant device_bind_driver(). Try to create the symlinks first, and
> if that succeeds (or if you don't care when it fails) then call
> driver_bound(). This confusion may be a result of looking at different
> source trees.

Ah, ok, I was looking at current git.

> Do you want to write another patch?

I'll see what I can come up with.

--
Cornelia Huck
Linux for zSeries Developer
Tel.: +49-7031-16-4837, Mail: [email protected]