2007-02-16 09:55:25

by Mike Galbraith

Subject: 2.6.20.git regression: 'PCI: add the sysfs driver name to all modules' causes hard hang on boot

Greetings,

Per $subject, git.yesterday hangs hard on boot here. A git bisect
fingered the commit below, which I verified via git bisect reset; git
revert -n 725522b5453dd680412f2b6463a988e4fd148757, after which box
boots fine. (well, I hope I verified... i'm git-ignorant)

commit 725522b5453dd680412f2b6463a988e4fd148757
Author: Greg Kroah-Hartman <[email protected]>
Date: Mon Jan 15 11:50:02 2007 -0800

PCI: add the sysfs driver name to all modules

This adds the module name to all PCI drivers, if they are built into the
kernel or not. It will show up in /sys/modules/MODULE_NAME/drivers/

It also fixes up the IDE core, which was calling __pci_register_driver()
directly.

Cc: Kay Sievers <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

:040000 040000 a97900b3284ece0d5f1a3eed372f98c5cb4b10d3 4d1f3045dc379f0eb7064fd05dd311625c090ef3 M drivers
:040000 040000 434c6fa20bd7f9acadb2c6d4a8dfe19a05fe8ea2 9f8ef888fe4050d8cab7333955a6eafb0d2f0fba M include


Kernel messages:
[ 0.781850] ipmi message handler version 39.1
[ 0.786438] ipmi device interface
[ 0.789973] IPMI System Interface driver.
<hangs here on bad kernel>
[ 0.805356] ipmi_si: Unable to find any System Interface(s)
[ 0.811224] IPMI Watchdog: driver initialized

gzipped up config attached.

-Mike


Attachments:
config.gz (13.61 kB)

2007-02-16 22:37:28

by Greg KH

Subject: Re: 2.6.20.git regression: 'PCI: add the sysfs driver name to all modules' causes hard hang on boot

On Fri, Feb 16, 2007 at 10:55:10AM +0100, Mike Galbraith wrote:
> Greetings,
>
> Per $subject, git.yesterday hangs hard on boot here. A git bisect
> fingered the commit below, which I verified via git bisect reset; git
> revert -n 725522b5453dd680412f2b6463a988e4fd148757, after which box
> boots fine. (well, I hope I verified... i'm git-ignorant)

If you change CONFIG_SYSFS_DEPRECATED to Y, does that solve the problem?

If not, can you press alt-sysrq-T and send us the list of the tasks so
that we can try to figure out what is hanging here?

thanks,

greg k-h

2007-02-17 01:38:21

by Mike Galbraith

Subject: Re: 2.6.20.git regression: 'PCI: add the sysfs driver name to all modules' causes hard hang on boot

On Fri, 2007-02-16 at 14:36 -0800, Greg KH wrote:
> On Fri, Feb 16, 2007 at 10:55:10AM +0100, Mike Galbraith wrote:
> > Greetings,
> >
> > Per $subject, git.yesterday hangs hard on boot here. A git bisect
> > fingered the commit below, which I verified via git bisect reset; git
> > revert -n 725522b5453dd680412f2b6463a988e4fd148757, after which box
> > boots fine. (well, I hope I verified... i'm git-ignorant)
>
> If you change CONFIG_SYSFS_DEPRECATED to Y, does that solve the problem?

It's already set.

> If not, can you press alt-sysrq-T and send us the list of the tasks so
> that we can try to figure out what is hanging here?

Box is ding-dong-dead.

I'll fiddle with it. (pretty darn innocuous looking change...)

-Mike

2007-02-17 01:52:10

by Greg KH

Subject: Re: 2.6.20.git regression: 'PCI: add the sysfs driver name to all modules' causes hard hang on boot

On Sat, Feb 17, 2007 at 02:38:08AM +0100, Mike Galbraith wrote:
> On Fri, 2007-02-16 at 14:36 -0800, Greg KH wrote:
> > On Fri, Feb 16, 2007 at 10:55:10AM +0100, Mike Galbraith wrote:
> > > Greetings,
> > >
> > > Per $subject, git.yesterday hangs hard on boot here. A git bisect
> > > fingered the commit below, which I verified via git bisect reset; git
> > > revert -n 725522b5453dd680412f2b6463a988e4fd148757, after which box
> > > boots fine. (well, I hope I verified... i'm git-ignorant)
> >
> > If you change CONFIG_SYSFS_DEPRECATED to Y, does that solve the problem?
>
> It's already set.

It's not set in the config file you sent to me and the list :)

thanks,

greg k-h

2007-02-17 02:21:41

by Markus Rechberger

Subject: Re: 2.6.20.git regression: 'PCI: add the sysfs driver name to all modules' causes hard hang on boot

On 2/17/07, Greg KH <[email protected]> wrote:
> On Sat, Feb 17, 2007 at 02:38:08AM +0100, Mike Galbraith wrote:
> > On Fri, 2007-02-16 at 14:36 -0800, Greg KH wrote:
> > > On Fri, Feb 16, 2007 at 10:55:10AM +0100, Mike Galbraith wrote:
> > > > Greetings,
> > > >
> > > > Per $subject, git.yesterday hangs hard on boot here. A git bisect
> > > > fingered the commit below, which I verified via git bisect reset; git
> > > > revert -n 725522b5453dd680412f2b6463a988e4fd148757, after which box
> > > > boots fine. (well, I hope I verified... i'm git-ignorant)
> > >
> > > If you change CONFIG_SYSFS_DEPRECATED to Y, does that solve the problem?
> >
> > It's already set.
>
> It's not set in the config file you sent to me and the list :)
>

I'm having a hard lockup too when I fire up xmms in X (maybe some other
apps too); I'm bisecting at the moment.
CONFIG_SYSFS_DEPRECATED is set to Y here too, so I might have another
problem here.. let's see what bisecting will show up..

Markus

2007-02-17 03:04:55

by Markus Rechberger

Subject: Re: 2.6.20.git regression: 'PCI: add the sysfs driver name to all modules' causes hard hang on boot

On 2/17/07, Markus Rechberger <[email protected]> wrote:
> On 2/17/07, Greg KH <[email protected]> wrote:
> > On Sat, Feb 17, 2007 at 02:38:08AM +0100, Mike Galbraith wrote:
> > > On Fri, 2007-02-16 at 14:36 -0800, Greg KH wrote:
> > > > On Fri, Feb 16, 2007 at 10:55:10AM +0100, Mike Galbraith wrote:
> > > > > Greetings,
> > > > >
> > > > > Per $subject, git.yesterday hangs hard on boot here. A git bisect
> > > > > > fingered the commit below, which I verified via git bisect reset; git
> > > > > > revert -n 725522b5453dd680412f2b6463a988e4fd148757, after which box
> > > > > boots fine. (well, I hope I verified... i'm git-ignorant)
> > > >
> > > > > If you change CONFIG_SYSFS_DEPRECATED to Y, does that solve the problem?
> > >
> > > It's already set.
> >
> > It's not set in the config file you sent to me and the list :)
> >
>
> I'm having a hard lockup too when I fire up xmms in X (maybe some other
> apps too); I'm bisecting at the moment.
> CONFIG_SYSFS_DEPRECATED is set to Y here too, so I might have another
> problem here.. let's see what bisecting will show up..
>

seems to be a "wrong" alarm here, it was caused by a new xorg.conf and
an intel 855gm chipset.

Markus

2007-02-17 04:56:11

by Greg KH

Subject: Re: 2.6.20.git regression: 'PCI: add the sysfs driver name to all modules' causes hard hang on boot

On Sat, Feb 17, 2007 at 04:04:52AM +0100, Markus Rechberger wrote:
> On 2/17/07, Markus Rechberger <[email protected]> wrote:
> >On 2/17/07, Greg KH <[email protected]> wrote:
> >> On Sat, Feb 17, 2007 at 02:38:08AM +0100, Mike Galbraith wrote:
> >> > On Fri, 2007-02-16 at 14:36 -0800, Greg KH wrote:
> >> > > On Fri, Feb 16, 2007 at 10:55:10AM +0100, Mike Galbraith wrote:
> >> > > > Greetings,
> >> > > >
> >> > > > Per $subject, git.yesterday hangs hard on boot here. A git bisect
> >> > > > fingered the commit below, which I verified via git bisect reset; git
> >> > > > revert -n 725522b5453dd680412f2b6463a988e4fd148757, after which box
> >> > > > boots fine. (well, I hope I verified... i'm git-ignorant)
> >> > >
> >> > > If you change CONFIG_SYSFS_DEPRECATED to Y, does that solve the problem?
> >> >
> >> > It's already set.
> >>
> >> It's not set in the config file you sent to me and the list :)
> >>
> >
> >I'm having a hard lockup too when I fire up xmms in X (maybe some other
> >apps too); I'm bisecting at the moment.
> >CONFIG_SYSFS_DEPRECATED is set to Y here too, so I might have another
> >problem here.. let's see what bisecting will show up..
> >
>
> seems to be a "wrong" alarm here, it was caused by a new xorg.conf and
> an intel 855gm chipset.

That's good, thanks for letting us know.

greg k-h

2007-02-17 08:20:15

by Mike Galbraith

Subject: Re: 2.6.20.git regression: 'PCI: add the sysfs driver name to all modules' causes hard hang on boot

On Fri, 2007-02-16 at 17:50 -0800, Greg KH wrote:
> On Sat, Feb 17, 2007 at 02:38:08AM +0100, Mike Galbraith wrote:
> > On Fri, 2007-02-16 at 14:36 -0800, Greg KH wrote:
> > > On Fri, Feb 16, 2007 at 10:55:10AM +0100, Mike Galbraith wrote:
> > > > Greetings,
> > > >
> > > > Per $subject, git.yesterday hangs hard on boot here. A git bisect
> > > > fingered the commit below, which I verified via git bisect reset; git
> > > > revert -n 725522b5453dd680412f2b6463a988e4fd148757, after which box
> > > > boots fine. (well, I hope I verified... i'm git-ignorant)
> > >
> > > If you change CONFIG_SYSFS_DEPRECATED to Y, does that solve the problem?
> >
> > It's already set.
>
> It's not set in the config file you sent to me and the list :)

Oops. (Lysdexic mouse, or friends+Aerosmith+beer+2:30A.M.:) Makes no
difference. Nada from nmi_watchdog either btw.

-Mike

2007-02-18 08:02:44

by Mike Galbraith

Subject: Re: 2.6.20.git regression: 'PCI: add the sysfs driver name to all modules' causes hard hang on boot

On Sat, 2007-02-17 at 09:20 +0100, Mike Galbraith wrote:
> On Fri, 2007-02-16 at 17:50 -0800, Greg KH wrote:
> > On Sat, Feb 17, 2007 at 02:38:08AM +0100, Mike Galbraith wrote:
> > > On Fri, 2007-02-16 at 14:36 -0800, Greg KH wrote:
> > > > On Fri, Feb 16, 2007 at 10:55:10AM +0100, Mike Galbraith wrote:
> > > > > Greetings,
> > > > >
> > > > > Per $subject, git.yesterday hangs hard on boot here. A git bisect
> > > > > fingered the commit below, which I verified via git bisect reset; git
> > > > > revert -n 725522b5453dd680412f2b6463a988e4fd148757, after which box
> > > > > boots fine. (well, I hope I verified... i'm git-ignorant)
> > > >
> > > > If you change CONFIG_SYSFS_DEPRECATED to Y, does that solve the problem?
> > >
> > > It's already set.
> >
> > It's not set in the config file you sent to me and the list :)
>
> Oops. (Lysdexic mouse, or friends+Aerosmith+beer+2:30A.M.:) Makes no
> difference. Nada from nmi_watchdog either btw.

The reason it's hanging is that nobody releases the driver, so we wait
forever in driver_unregister(). With the below, box boots fine...

--- drivers/base/bus.c.org 2007-02-18 08:38:57.000000000 +0100
+++ drivers/base/bus.c 2007-02-18 08:39:09.000000000 +0100
@@ -593,6 +593,7 @@ void bus_remove_driver(struct device_dri
 	driver_detach(drv);
 	module_remove_driver(drv);
 	kobject_unregister(&drv->kobj);
+	driver_release(&drv->kobj);
 	put_bus(drv->bus);
 }


...but that can't be right given that the darn thing booted just fine
prior to the naming patch with an equally unhappy init_ipmi_si(). Hmm.

-Mike

2007-02-18 09:27:37

by Mike Galbraith

Subject: Re: 2.6.20.git regression: 'PCI: add the sysfs driver name to all modules' causes hard hang on boot

On Sun, 2007-02-18 at 09:02 +0100, Mike Galbraith wrote:

> The reason it's hanging is that nobody releases the driver, so we wait
> forever in driver_unregister(). With the below, box boots fine...
>
> --- drivers/base/bus.c.org 2007-02-18 08:38:57.000000000 +0100
> +++ drivers/base/bus.c 2007-02-18 08:39:09.000000000 +0100
> @@ -593,6 +593,7 @@ void bus_remove_driver(struct device_dri
>  	driver_detach(drv);
>  	module_remove_driver(drv);
>  	kobject_unregister(&drv->kobj);
> +	driver_release(&drv->kobj);
>  	put_bus(drv->bus);
>  }
>
>
> ...but that can't be right given that the darn thing booted just fine
> prior to the naming patch with an equally unhappy init_ipmi_si(). Hmm.

Ok. The path it's supposed to take to driver_release() goes like so....

[ 17.495312] bus platform: add driver ipmi
[ 17.506560] ipmi message handler version 39.1
[ 17.518099] ipmi device interface
[ 17.528491] device class 'ipmi': registering
[ 17.539854] bus platform: add driver ipmi_si
[ 17.551210] IPMI System Interface driver.
[ 17.562242] bus pci: add driver ipmi_si
[ 17.583686] bus pci: remove driver ipmi_si
[ 17.594721] BUG: at drivers/base/bus.c:65 driver_release()
[ 17.607224] [<c0105136>] show_trace_log_lvl+0x1a/0x30
[ 17.619434] [<c0105862>] show_trace+0x12/0x14
[ 17.630822] [<c0105906>] dump_stack+0x16/0x18
[ 17.642098] [<c034b632>] driver_release+0x37/0x39
[ 17.653703] [<c02c73b9>] kobject_cleanup+0x43/0x64
[ 17.665359] [<c02c73e5>] kobject_release+0xb/0xd
[ 17.676748] [<c02c8017>] kref_put+0x28/0x8c
[ 17.687626] [<c02c7374>] kobject_put+0x14/0x16
[ 17.698712] [<c02c74c4>] kobject_unregister+0x22/0x25
[ 17.710359] [<c034b7e0>] bus_remove_driver+0x95/0xa5
[ 17.721911] [<c034c87b>] driver_unregister+0xe/0x47
[ 17.733317] [<c02d59ac>] pci_unregister_driver+0x13/0x73
[ 17.745149] [<c033e141>] init_ipmi_si+0x798/0x7ba
[ 17.756339] [<c065b58c>] init+0x114/0x23c
[ 17.766748] [<c0104dab>] kernel_thread_helper+0x7/0x1c

...so I guess it's a ref counting problem somewhere.

-Mike

2007-02-19 06:24:58

by Mike Galbraith

Subject: [patch] Re: 2.6.20.git regression: 'PCI: add the sysfs driver name to all modules' causes hard hang on boot

On Sun, 2007-02-18 at 10:27 +0100, Mike Galbraith wrote:
> On Sun, 2007-02-18 at 09:02 +0100, Mike Galbraith wrote:
>
> > The reason it's hanging is that nobody releases the driver, so we wait
> > forever in driver_unregister(). With the below, box boots fine...
> >
> > --- drivers/base/bus.c.org 2007-02-18 08:38:57.000000000 +0100
> > +++ drivers/base/bus.c 2007-02-18 08:39:09.000000000 +0100
> > @@ -593,6 +593,7 @@ void bus_remove_driver(struct device_dri
> >  	driver_detach(drv);
> >  	module_remove_driver(drv);
> >  	kobject_unregister(&drv->kobj);
> > +	driver_release(&drv->kobj);
> >  	put_bus(drv->bus);
> >  }
> >
> >
> > ...but that can't be right given that the darn thing booted just fine
> > prior to the naming patch with an equally unhappy init_ipmi_si(). Hmm.
>
> Ok. The path it's supposed to take to driver_release() goes like so....
>
> [ 17.495312] bus platform: add driver ipmi
> [ 17.506560] ipmi message handler version 39.1
> [ 17.518099] ipmi device interface
> [ 17.528491] device class 'ipmi': registering
> [ 17.539854] bus platform: add driver ipmi_si
> [ 17.551210] IPMI System Interface driver.
> [ 17.562242] bus pci: add driver ipmi_si
> [ 17.583686] bus pci: remove driver ipmi_si
> [ 17.594721] BUG: at drivers/base/bus.c:65 driver_release()
> [ 17.607224] [<c0105136>] show_trace_log_lvl+0x1a/0x30
> [ 17.619434] [<c0105862>] show_trace+0x12/0x14
> [ 17.630822] [<c0105906>] dump_stack+0x16/0x18
> [ 17.642098] [<c034b632>] driver_release+0x37/0x39
> [ 17.653703] [<c02c73b9>] kobject_cleanup+0x43/0x64
> [ 17.665359] [<c02c73e5>] kobject_release+0xb/0xd
> [ 17.676748] [<c02c8017>] kref_put+0x28/0x8c
> [ 17.687626] [<c02c7374>] kobject_put+0x14/0x16
> [ 17.698712] [<c02c74c4>] kobject_unregister+0x22/0x25
> [ 17.710359] [<c034b7e0>] bus_remove_driver+0x95/0xa5
> [ 17.721911] [<c034c87b>] driver_unregister+0xe/0x47
> [ 17.733317] [<c02d59ac>] pci_unregister_driver+0x13/0x73
> [ 17.745149] [<c033e141>] init_ipmi_si+0x798/0x7ba
> [ 17.756339] [<c065b58c>] init+0x114/0x23c
> [ 17.766748] [<c0104dab>] kernel_thread_helper+0x7/0x1c
>
> ...so I guess it's a ref counting problem somewhere.

The below fixes a reference counting bug exposed by commit
725522b5453dd680412f2b6463a988e4fd148757. If driver.mod_name exists, we
take a reference in module_add_driver(), and never release it. Undo
that reference in module_remove_driver().

My box now boots fine, and modprobe/rmmod didn't explode, so I'll add a
blame line.

Signed-off-by: Mike Galbraith <[email protected]>

--- a/kernel/module.c.org 2007-02-19 06:41:02.000000000 +0100
+++ b/kernel/module.c 2007-02-19 06:49:08.000000000 +0100
@@ -2417,6 +2417,12 @@ void module_remove_driver(struct device_
 			kfree(driver_name);
 		}
 	}
+	/*
+	 * Undo the additional reference we added in module_add_driver()
+	 * via kset_find_obj()
+	 */
+	if (drv->mod_name)
+		kobject_put(&drv->kobj);
 }
 EXPORT_SYMBOL(module_remove_driver);