2006-08-04 19:04:03

by Michal Piotrowski

[permalink] [raw]
Subject: 2.6.18-rc3-g3b445eea BUG: warning at /usr/src/linux-git/kernel/cpu.c:51/unlock_cpu_hotplug()

Hi,

This was reported
http://www.ussg.iu.edu/hypermail/linux/kernel/0607.3/1867.html
http://www.ussg.iu.edu/hypermail/linux/kernel/0608.0/0675.html

It's time to use git-bisect.

BUG: warning at /usr/src/linux-git/kernel/cpu.c:51/unlock_cpu_hotplug()
[<c0132a04>] unlock_cpu_hotplug+0x2c/0x54
[<c013a2ec>] stop_machine_run+0x2e/0x34
[<c0135686>] sys_init_module+0x15a0/0x178a
[<c015b7b7>] do_sync_read+0xb6/0xf1
[<c0102d51>] sysenter_past_esp+0x56/0x79

Here is a config file
http://www.stardust.webpages.pl/files/2.6-git/18-rc3/config
Here is dmesg http://www.stardust.webpages.pl/files/2.6-git/18-rc3/dmesg

Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/wiki/)


2006-08-04 19:25:07

by Linus Torvalds

[permalink] [raw]
Subject: Re: 2.6.18-rc3-g3b445eea BUG: warning at /usr/src/linux-git/kernel/cpu.c:51/unlock_cpu_hotplug()



On Fri, 4 Aug 2006, Michal Piotrowski wrote:
>
> This was reported
> http://www.ussg.iu.edu/hypermail/linux/kernel/0607.3/1867.html
> http://www.ussg.iu.edu/hypermail/linux/kernel/0608.0/0675.html
>
> It's time to use git-bisect.
>
> BUG: warning at /usr/src/linux-git/kernel/cpu.c:51/unlock_cpu_hotplug()

Ok, that really is a pretty scary thing. The warning happens if the lock
is released by somebody else than the person who took it. I don't _think_
that's supposed to happen, but with cpu hotplug locking being as
"creative" as it is, maybe it's valid.

I was planning on shutting the cpu hotplug warnings up for v2.6.18 - at
least the "lukewarm IQ" one for _taking_ the lock recursively. But I had
planned on leaving the unlock_cpu_hotplug() in place, since up until now
it hadn't triggered, and it really smells like a bug if it does.

> [<c0132a04>] unlock_cpu_hotplug+0x2c/0x54
> [<c013a2ec>] stop_machine_run+0x2e/0x34
> [<c0135686>] sys_init_module+0x15a0/0x178a
> [<c015b7b7>] do_sync_read+0xb6/0xf1
> [<c0102d51>] sysenter_past_esp+0x56/0x79


DaveJ, any ideas?

Linus

2006-08-04 22:24:08

by Dave Jones

[permalink] [raw]
Subject: Re: 2.6.18-rc3-g3b445eea BUG: warning at /usr/src/linux-git/kernel/cpu.c:51/unlock_cpu_hotplug()

On Fri, Aug 04, 2006 at 12:24:57PM -0700, Linus Torvalds wrote:

> On Fri, 4 Aug 2006, Michal Piotrowski wrote:
> > This was reported
> > http://www.ussg.iu.edu/hypermail/linux/kernel/0607.3/1867.html
> > http://www.ussg.iu.edu/hypermail/linux/kernel/0608.0/0675.html
> >
> > It's time to use git-bisect.
> >
> > BUG: warning at /usr/src/linux-git/kernel/cpu.c:51/unlock_cpu_hotplug()
>
> Ok, that really is a pretty scary thing. The warning happens if the lock
> is released by somebody else than the person who took it. I don't _think_
> that's supposed to happen, but with cpu hotplug locking being as
> "creative" as it is, maybe it's valid.
>
> I was planning on shutting the cpu hotplug warnings up for v2.6.18 - at
> least the "lukewarm IQ" one for _taking_ the lock recursively. But I had
> planned on leaving the unlock_cpu_hotplug() in place, since up until now
> it hadn't triggered, and it really smells like a bug if it does.
>
> > [<c0132a04>] unlock_cpu_hotplug+0x2c/0x54
> > [<c013a2ec>] stop_machine_run+0x2e/0x34
> > [<c0135686>] sys_init_module+0x15a0/0x178a
> > [<c015b7b7>] do_sync_read+0xb6/0xf1
> > [<c0102d51>] sysenter_past_esp+0x56/0x79
>
> DaveJ, any ideas?

Not really :-/
Whilst it's mentioned that the git-cpufreq.patch that was in -mm
made this happen, I really don't see anything in what you merged
yesterday that could explain any of this.

99% of the changes were to longhaul, so we can ignore those,
leaving just:

[CPUFREQ] return error when failing to set minfreq

Looks benign.

[CPUFREQ] Propagate acpi_processor_preregister_performance return value.

Shouldn't affect anything wrt hotplug

[CPUFREQ] [1/2] add __find_governor helper and clean up some error handling.
[CPUFREQ] [2/2] demand load governor modules.

Looks harmless to me, and those new functions don't show up
in any of those functions.

So I'm at a loss to explain why this made this bug suddenly appear, but
I don't think this is anything new. Fixing it however...
I'm looking at it, but I don't see anything obvious yet.

Dave

--
http://www.codemonkey.org.uk

2006-08-05 00:31:49

by Dave Jones

[permalink] [raw]
Subject: Re: 2.6.18-rc3-g3b445eea BUG: warning at /usr/src/linux-git/kernel/cpu.c:51/unlock_cpu_hotplug()

On Fri, Aug 04, 2006 at 06:24:00PM -0400, Dave Jones wrote:
> > DaveJ, any ideas?
>
> So I'm at a loss to explain why this made this bug suddenly appear, but
> I don't think this is anything new. Fixing it however...
> I'm looking at it, but I don't see anything obvious yet.

Ok, debugging this is getting ridiculous. I dug out a core duo,
built a kernel and watched it spontaneously reboot before it even got to init.
Digging out a serial cable yielded this gem:


Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 1020104k/1047108k available (2082k kernel code, 26324k reserved, 1107k data, 240k init, 129604k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.

=============================================
[ INFO: possible recursive locking detected ]
---------------------------------------------
swapper/0 is trying to acquire lock:
(cpu_bitmask_lock){--..}, at: [<c0304e9c>] mutex_lock+0x21/0x24

but task is already holding lock:
(cpu_bitmask_lock){--..}, at: [<c0304e9c>] mutex_lock+0x21/0x24

other info that might help us debug this:
1 lock held by swapper/0:
#0: (cpu_bitmask_lock){--..}, at: [<c0304e9c>] mutex_lock+0x21/0x24

stack backtrace:
[<c0104edc>] show_trace_log_lvl+0x58/0x152
[<c01054c2>] show_trace+0xd/0x10
[<c01055db>] dump_stack+0x19/0x1b
[<c013ae05>] __lock_acquire+0x74f/0x96d
[<c013b566>] lock_acquire+0x4b/0x6d
[<c0304d35>] __mutex_lock_slowpath+0xb0/0x1f6
[<c0304e9c>] mutex_lock+0x21/0x24
[<c013e8d4>] lock_cpu_hotplug+0x4a/0x4c
[<c016bed3>] kmem_cache_create+0x61/0x577
[<c04af5a1>] kmem_cache_init+0x177/0x38c
[<c049c5f3>] start_kernel+0x221/0x3b3
[<c0100210>] 0xc0100210
DWARF2 unwinder stuck at 0xc0100210
Leftover inexact backtrace:
=======================
=======================
=======================
bad: scheduling from the idle thread!
[<c0104edc>] show_trace_log_lvl+0x58/0x152
[<c01054c2>] show_trace+0xd/0x10
[<c01055db>] dump_stack+0x19/0x1b
[<c03035a6>] schedule+0xa6/0x9d3
[<c0304d95>] __mutex_lock_slowpath+0x110/0x1f6
[<c0304e9c>] mutex_lock+0x21/0x24
[<c013e8d4>] lock_cpu_hotplug+0x4a/0x4c
[<c016bed3>] kmem_cache_create+0x61/0x577
[<c04af5a1>] kmem_cache_init+0x177/0x38c
[<c049c5f3>] start_kernel+0x221/0x3b3
[<c0100210>] 0xc0100210
DWARF2 unwinder stuck at 0xc0100210
Leftover inexact backtrace:
=======================
=======================
=======================

<Insert reboot here>

Waaaay before cpufreq even enters the picture.

Sigh.

Dave

--
http://www.codemonkey.org.uk

2006-08-05 02:10:59

by Dave Jones

[permalink] [raw]
Subject: Re: 2.6.18-rc3-g3b445eea BUG: warning at /usr/src/linux-git/kernel/cpu.c:51/unlock_cpu_hotplug()

On Fri, Aug 04, 2006 at 08:31:42PM -0400, Dave Jones wrote:
> On Fri, Aug 04, 2006 at 06:24:00PM -0400, Dave Jones wrote:
> > > DaveJ, any ideas?
> >
> > So I'm at a loss to explain why this made this bug suddenly appear, but
> > I don't think this is anything new. Fixing it however...
> > I'm looking at it, but I don't see anything obvious yet.
>
> Ok, debugging this is getting ridiculous. I dug out a core duo,
> built a kernel and watched it spontaneously reboot before it even got to init.

Hmm, this went away after a ccache flush & a rebuild. Old crud left behind
from earlier debugging I guess.

So I added this diff..

diff --git a/kernel/cpu.c b/kernel/cpu.c
index f230f9a..31210a0 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -31,6 +31,14 @@ void lock_cpu_hotplug(void)
{
struct task_struct *tsk = current;

+ if (jiffies > HZ * 60) {
+ printk ("CPU%d called lock_cpu_hotplug() for app %s. recursive_depth=%d\n",
+ smp_processor_id(),
+ current->comm,
+ recursive_depth);
+ dump_stack();
+ }
+
if (tsk == recursive) {
static int warnings = 10;
if (warnings) {
@@ -42,18 +50,30 @@ void lock_cpu_hotplug(void)
return;
}
mutex_lock(&cpu_bitmask_lock);
+ if (jiffies > HZ * 60)
+ printk ("%s acquired cpu_bitmask_lock\n", current->comm);
recursive = tsk;
}
EXPORT_SYMBOL_GPL(lock_cpu_hotplug);

void unlock_cpu_hotplug(void)
{
+ if (jiffies > HZ * 120) {
+ printk ("CPU%d called unlock_cpu_hotplug() for app %s. recursive_depth=%d\n",
+ smp_processor_id(),
+ current->comm,
+ recursive_depth);
+ dump_stack();
+ }
+
WARN_ON(recursive != current);
if (recursive_depth) {
recursive_depth--;
return;
}
mutex_unlock(&cpu_bitmask_lock);
+ if (jiffies > HZ * 120)
+ printk ("%s released cpu_bitmask_lock\n", current->comm);
recursive = NULL;
}
EXPORT_SYMBOL_GPL(unlock_cpu_hotplug);

In the hope to better understand what the hell is going on.
Here's what happened when I started the cpuspeed service

(That DWARF unwinder stuck thing is really darned annoying btw)

CPU1 called lock_cpu_hotplug() for app modprobe. recursive_depth=0
[<c0104edc>] show_trace_log_lvl+0x58/0x152
[<c01054c2>] show_trace+0xd/0x10
[<c01055db>] dump_stack+0x19/0x1b
[<c013e8c3>] lock_cpu_hotplug+0x39/0xbf
[<c01499c8>] stop_machine_run+0x11/0x38
[<c0141943>] sys_init_module+0x166d/0x1847
[<c0103db9>] sysenter_past_esp+0x56/0x8d
DWARF2 unwinder stuck at sysenter_past_esp+0x56/0x8d
Leftover inexact backtrace:
[<c01054c2>] show_trace+0xd/0x10
[<c01055db>] dump_stack+0x19/0x1b
[<c013e8c3>] lock_cpu_hotplug+0x39/0xbf
[<c01499c8>] stop_machine_run+0x11/0x38
[<c0141943>] sys_init_module+0x166d/0x1847
[<c0103db9>] sysenter_past_esp+0x56/0x8d
modprobe acquired cpu_bitmask_lock

CPU1 called lock_cpu_hotplug() for app cpuspeed. recursive_depth=0
[<c0104edc>] show_trace_log_lvl+0x58/0x152
[<c01054c2>] show_trace+0xd/0x10
[<c01055db>] dump_stack+0x19/0x1b
[<c013e8c3>] lock_cpu_hotplug+0x39/0xbf
[<c029fbae>] store_scaling_governor+0x142/0x1a3
[<c029f1a5>] store+0x37/0x48
[<c01a6561>] sysfs_write_file+0xab/0xd1
[<c016f99f>] vfs_write+0xab/0x157
[<c016ffe4>] sys_write+0x3b/0x60
[<c0103db9>] sysenter_past_esp+0x56/0x8d
DWARF2 unwinder stuck at sysenter_past_esp+0x56/0x8d
Leftover inexact backtrace:
[<c01054c2>] show_trace+0xd/0x10
[<c01055db>] dump_stack+0x19/0x1b
[<c013e8c3>] lock_cpu_hotplug+0x39/0xbf
[<c029fbae>] store_scaling_governor+0x142/0x1a3
[<c029f1a5>] store+0x37/0x48
[<c01a6561>] sysfs_write_file+0xab/0xd1
[<c016f99f>] vfs_write+0xab/0x157
[<c016ffe4>] sys_write+0x3b/0x60
[<c0103db9>] sysenter_past_esp+0x56/0x8d
cpuspeed acquired cpu_bitmask_lock

I have questions.

1. How come the printk at the end of unlock_cpu_hotplug() never
got hit when modprobe got to the end of stop_machine_run() ?

2. How come another process acquired this mutex when the previous
one aparently never released it ?

Moving on ...

CPU1 called lock_cpu_hotplug() for app cpuspeed. recursive_depth=0
[<c0104edc>] show_trace_log_lvl+0x58/0x152
[<c01054c2>] show_trace+0xd/0x10
[<c01055db>] dump_stack+0x19/0x1b
[<c013e8c3>] lock_cpu_hotplug+0x39/0xbf
[<c0132f3c>] __create_workqueue+0x52/0x122
[<f8faf34b>] cpufreq_governor_dbs+0x9f/0x2c3 [cpufreq_ondemand]
[<c029f7b6>] __cpufreq_governor+0x57/0xd8
[<c029f985>] __cpufreq_set_policy+0x14e/0x1bc
[<c029fbc5>] store_scaling_governor+0x159/0x1a3
[<c029f1a5>] store+0x37/0x48
[<c01a6561>] sysfs_write_file+0xab/0xd1
[<c016f99f>] vfs_write+0xab/0x157
[<c016ffe4>] sys_write+0x3b/0x60
[<c0103db9>] sysenter_past_esp+0x56/0x8d
DWARF2 unwinder stuck at sysenter_past_esp+0x56/0x8d
Leftover inexact backtrace:
[<c01054c2>] show_trace+0xd/0x10
[<c01055db>] dump_stack+0x19/0x1b
[<c013e8c3>] lock_cpu_hotplug+0x39/0xbf
[<c0132f3c>] __create_workqueue+0x52/0x122
[<f8faf34b>] cpufreq_governor_dbs+0x9f/0x2c3 [cpufreq_ondemand]
[<c029f7b6>] __cpufreq_governor+0x57/0xd8
[<c029f985>] __cpufreq_set_policy+0x14e/0x1bc
[<c029fbc5>] store_scaling_governor+0x159/0x1a3
[<c029f1a5>] store+0x37/0x48
[<c01a6561>] sysfs_write_file+0xab/0xd1
[<c016f99f>] vfs_write+0xab/0x157
[<c016ffe4>] sys_write+0x3b/0x60
[<c0103db9>] sysenter_past_esp+0x56/0x8d
Lukewarm IQ detected in hotplug locking
BUG: warning at kernel/cpu.c:46/lock_cpu_hotplug()
[<c0104edc>] show_trace_log_lvl+0x58/0x152
[<c01054c2>] show_trace+0xd/0x10
[<c01055db>] dump_stack+0x19/0x1b
[<c013e8fc>] lock_cpu_hotplug+0x72/0xbf
[<c0132f3c>] __create_workqueue+0x52/0x122
[<f8faf34b>] cpufreq_governor_dbs+0x9f/0x2c3 [cpufreq_ondemand]
[<c029f7b6>] __cpufreq_governor+0x57/0xd8
[<c029f985>] __cpufreq_set_policy+0x14e/0x1bc
[<c029fbc5>] store_scaling_governor+0x159/0x1a3
[<c029f1a5>] store+0x37/0x48
[<c01a6561>] sysfs_write_file+0xab/0xd1
[<c016f99f>] vfs_write+0xab/0x157
[<c016ffe4>] sys_write+0x3b/0x60
[<c0103db9>] sysenter_past_esp+0x56/0x8d
DWARF2 unwinder stuck at sysenter_past_esp+0x56/0x8d
Leftover inexact backtrace:
[<c01054c2>] show_trace+0xd/0x10
[<c01055db>] dump_stack+0x19/0x1b
[<c013e8fc>] lock_cpu_hotplug+0x72/0xbf
[<c0132f3c>] __create_workqueue+0x52/0x122
[<f8faf34b>] cpufreq_governor_dbs+0x9f/0x2c3 [cpufreq_ondemand]
[<c029f7b6>] __cpufreq_governor+0x57/0xd8
[<c029f985>] __cpufreq_set_policy+0x14e/0x1bc
[<c029fbc5>] store_scaling_governor+0x159/0x1a3
[<c029f1a5>] store+0x37/0x48
[<c01a6561>] sysfs_write_file+0xab/0xd1
[<c016f99f>] vfs_write+0xab/0x157
[<c016ffe4>] sys_write+0x3b/0x60
[<c0103db9>] sysenter_past_esp+0x56/0x8d


Ok, now that it's done insulting my IQ, recursive_depth should be 1.
However ...


CPU1 called lock_cpu_hotplug() for app cpuspeed. recursive_depth=0
[<c0104edc>] show_trace_log_lvl+0x58/0x152
[<c01054c2>] show_trace+0xd/0x10
[<c01055db>] dump_stack+0x19/0x1b
[<c013e8c3>] lock_cpu_hotplug+0x39/0xbf
[<c029fbae>] store_scaling_governor+0x142/0x1a3
[<c029f1a5>] store+0x37/0x48
[<c01a6561>] sysfs_write_file+0xab/0xd1
[<c016f99f>] vfs_write+0xab/0x157
[<c016ffe4>] sys_write+0x3b/0x60
[<c0103db9>] sysenter_past_esp+0x56/0x8d


WTF is going on here ?
In the whole log, recursive_depth is always 0.


At this stage, it gets into a death spiral printing out the same
crap over and over again. I put the full log at
http://people.redhat.com/davej/hotplug-wtf.txt

Note how the acquire/releases don't match up at all. Scary.

Dave

--
http://www.codemonkey.org.uk

2006-08-05 02:24:07

by Dave Jones

[permalink] [raw]
Subject: Re: 2.6.18-rc3-g3b445eea BUG: warning at /usr/src/linux-git/kernel/cpu.c:51/unlock_cpu_hotplug()

On Fri, Aug 04, 2006 at 10:10:51PM -0400, Dave Jones wrote:

> So I added this diff..
>
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index f230f9a..31210a0 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -31,6 +31,14 @@ void lock_cpu_hotplug(void)
> {
> struct task_struct *tsk = current;
>
> + if (jiffies > HZ * 60) {

....

> void unlock_cpu_hotplug(void)
> {
> + if (jiffies > HZ * 120) {
^^^
Duh. Everything becomes clearer the moment you post a diff to lkml.

Dave

--
http://www.codemonkey.org.uk

2006-08-05 02:49:53

by Dave Jones

[permalink] [raw]
Subject: Re: 2.6.18-rc3-g3b445eea BUG: warning at /usr/src/linux-git/kernel/cpu.c:51/unlock_cpu_hotplug()

On Fri, Aug 04, 2006 at 10:23:56PM -0400, Dave Jones wrote:

> Duh. Everything becomes clearer the moment you post a diff to lkml.

Right, with that silly thinko out of the way, things make _slightly_
more sense, but I'm still puzzled. Here's the trace (with the DWARF
noise stripped out).

CPU1 called lock_cpu_hotplug() for app cpuspeed. recursive_depth=0
[<c0104edc>] show_trace_log_lvl+0x58/0x152
[<c01054c2>] show_trace+0xd/0x10
[<c01055db>] dump_stack+0x19/0x1b
[<c013e8c3>] lock_cpu_hotplug+0x39/0xbf
[<c029fbae>] store_scaling_governor+0x142/0x1a3
[<c029f1a5>] store+0x37/0x48
[<c01a6561>] sysfs_write_file+0xab/0xd1
[<c016f99f>] vfs_write+0xab/0x157
[<c016ffe4>] sys_write+0x3b/0x60
[<c0103db9>] sysenter_past_esp+0x56/0x8d
cpuspeed acquired cpu_bitmask_lock

CPU1 called lock_cpu_hotplug() for app cpuspeed. recursive_depth=0
[<c0104edc>] show_trace_log_lvl+0x58/0x152
[<c01054c2>] show_trace+0xd/0x10
[<c01055db>] dump_stack+0x19/0x1b
[<c013e8c3>] lock_cpu_hotplug+0x39/0xbf
[<c0132f3c>] __create_workqueue+0x52/0x122
[<f901234b>] cpufreq_governor_dbs+0x9f/0x2c3 [cpufreq_ondemand]
[<c029f7b6>] __cpufreq_governor+0x57/0xd8
[<c029f985>] __cpufreq_set_policy+0x14e/0x1bc
[<c029fbc5>] store_scaling_governor+0x159/0x1a3
[<c029f1a5>] store+0x37/0x48
[<c01a6561>] sysfs_write_file+0xab/0xd1
[<c016f99f>] vfs_write+0xab/0x157
[<c016ffe4>] sys_write+0x3b/0x60
[<c0103db9>] sysenter_past_esp+0x56/0x8d
Lukewarm IQ detected in hotplug locking
BUG: warning at kernel/cpu.c:46/lock_cpu_hotplug()

CPU1 called unlock_cpu_hotplug() for app cpuspeed. recursive_depth=1
[<c0104edc>] show_trace_log_lvl+0x58/0x152
[<c01054c2>] show_trace+0xd/0x10
[<c01055db>] dump_stack+0x19/0x1b
[<c013e980>] unlock_cpu_hotplug+0x37/0xb7
[<c0132fea>] __create_workqueue+0x100/0x122
[<f901234b>] cpufreq_governor_dbs+0x9f/0x2c3 [cpufreq_ondemand]
[<c029f7b6>] __cpufreq_governor+0x57/0xd8
[<c029f985>] __cpufreq_set_policy+0x14e/0x1bc
[<c029fbc5>] store_scaling_governor+0x159/0x1a3
[<c029f1a5>] store+0x37/0x48
[<c01a6561>] sysfs_write_file+0xab/0xd1
[<c016f99f>] vfs_write+0xab/0x157
[<c016ffe4>] sys_write+0x3b/0x60
[<c0103db9>] sysenter_past_esp+0x56/0x8d

CPU1 called unlock_cpu_hotplug() for app cpuspeed. recursive_depth=0
[<c0104edc>] show_trace_log_lvl+0x58/0x152
[<c01054c2>] show_trace+0xd/0x10
[<c01055db>] dump_stack+0x19/0x1b
[<c013e980>] unlock_cpu_hotplug+0x37/0xb7
[<c029fbe5>] store_scaling_governor+0x179/0x1a3
[<c029f1a5>] store+0x37/0x48
[<c01a6561>] sysfs_write_file+0xab/0xd1
[<c016f99f>] vfs_write+0xab/0x157
[<c016ffe4>] sys_write+0x3b/0x60
[<c0103db9>] sysenter_past_esp+0x56/0x8d

So in these traces we're seeing
lock
lock
unlock
unlock

But what I really don't understand is the ordering here.
Immediately after that first trace we should see an unlock.
store_scaling_governor does a lock/unlock pair, with no chance
of returning with the hotplug lock still held.

In the second trace however, cpuspeed is off doing something
completely different.

How can this happen?

My head hurts.

Dave

--
http://www.codemonkey.org.uk

2006-08-05 06:47:34

by Dave Jones

[permalink] [raw]
Subject: Re: 2.6.18-rc3-g3b445eea BUG: warning at /usr/src/linux-git/kernel/cpu.c:51/unlock_cpu_hotplug()

On Fri, Aug 04, 2006 at 10:49:47PM -0400, Dave Jones wrote:

This trace now makes a lot more sense to me.

> CPU1 called lock_cpu_hotplug() for app cpuspeed. recursive_depth=0
> [<c0104edc>] show_trace_log_lvl+0x58/0x152
> [<c01054c2>] show_trace+0xd/0x10
> [<c01055db>] dump_stack+0x19/0x1b
> [<c013e8c3>] lock_cpu_hotplug+0x39/0xbf
> [<c029fbae>] store_scaling_governor+0x142/0x1a3
> [<c029f1a5>] store+0x37/0x48
> [<c01a6561>] sysfs_write_file+0xab/0xd1
> [<c016f99f>] vfs_write+0xab/0x157
> [<c016ffe4>] sys_write+0x3b/0x60
> [<c0103db9>] sysenter_past_esp+0x56/0x8d
> cpuspeed acquired cpu_bitmask_lock
>
> CPU1 called lock_cpu_hotplug() for app cpuspeed. recursive_depth=0
> [<c0104edc>] show_trace_log_lvl+0x58/0x152
> [<c01054c2>] show_trace+0xd/0x10
> [<c01055db>] dump_stack+0x19/0x1b
> [<c013e8c3>] lock_cpu_hotplug+0x39/0xbf
> [<c0132f3c>] __create_workqueue+0x52/0x122
> [<f901234b>] cpufreq_governor_dbs+0x9f/0x2c3 [cpufreq_ondemand]
> [<c029f7b6>] __cpufreq_governor+0x57/0xd8
> [<c029f985>] __cpufreq_set_policy+0x14e/0x1bc
> [<c029fbc5>] store_scaling_governor+0x159/0x1a3
> [<c029f1a5>] store+0x37/0x48
> [<c01a6561>] sysfs_write_file+0xab/0xd1
> [<c016f99f>] vfs_write+0xab/0x157
> [<c016ffe4>] sys_write+0x3b/0x60
> [<c0103db9>] sysenter_past_esp+0x56/0x8d
> Lukewarm IQ detected in hotplug locking
> BUG: warning at kernel/cpu.c:46/lock_cpu_hotplug()

So when we write to sysfs to set the governor, we end up in store_scaling_governor()
which takes the hotplug lock, and then calls off into the governor to let it
do its thing. Part of ondemand's "thing" is to create a workqueue.
unfortunatly, __create_workqueue also takes the hotplug lock.

Creating a variant of __create_workqueue that doesn't take the lock
seems really nasty.

We could remove the locking from store_scaling_governor() and make the governors
themselves have to do the locking, but I'm not sure that's entirely safe.

We could do something really disgusting like ...

unlock_cpu_hotplug()
...
create_workqueue()
...
lock_cpu_hotplug()

in ondemand, which opens up a tiny race window, but as ugly as it is,
looks to be the best solution of the bunch right now.

Comments?

The really sad part is this is completely unrelated to the original bug reported
in this thread, which shows just how widespread this braindamage is.
Michal's traces really don't really scream anything obvious to me.
(Though given it took me 4 hours to decode my own traces above, this is no
real sign of how big a problem this might be).

Michal, could you apply this diff.. http://lkml.org/lkml/diff/2006/8/4/381/1
(change the '120' to '60' first), and send me the debug spew that you get ?
You'll have to wait until a minute of uptime has passed. Oh, and edit
include/linux/jiffies.h to change INITIAL_JIFFIES to '0'.

Dave

--
http://www.codemonkey.org.uk

2006-08-05 07:46:12

by Andrew Morton

[permalink] [raw]
Subject: Re: 2.6.18-rc3-g3b445eea BUG: warning at /usr/src/linux-git/kernel/cpu.c:51/unlock_cpu_hotplug()

On Sat, 5 Aug 2006 02:47:28 -0400
Dave Jones <[email protected]> wrote:

> Creating a variant of __create_workqueue that doesn't take the lock
> seems really nasty.

Honestly, the default fix for lock_cpu_hotplug() problems should be to
delete the lock_cpu_hotplug() calls and to implement local locking.

Here's a not-runtime-tested workqueue.c example.



From: Andrew Morton <[email protected]>

Use a private lock instead. It protects all per-cpu data structures in
workqueue.c, including the workqueues list.

Fix a bug in schedule_on_each_cpu(): it was forgetting to lock down the
per-cpu resources.

Unfixed long-standing bug: if someone unplugs the CPU identified by
`singlethread_cpu' the kernel will get very sick.

Signed-off-by: Andrew Morton <[email protected]>
---

kernel/workqueue.c | 33 +++++++++++++++++++++------------
1 file changed, 21 insertions(+), 12 deletions(-)

diff -puN kernel/workqueue.c~a kernel/workqueue.c
--- a/kernel/workqueue.c~a
+++ a/kernel/workqueue.c
@@ -68,7 +68,7 @@ struct workqueue_struct {

/* All the per-cpu workqueues on the system, for hotplug cpu to add/remove
threads to each one as cpus come/go. */
-static DEFINE_SPINLOCK(workqueue_lock);
+static DEFINE_MUTEX(workqueue_mutex);
static LIST_HEAD(workqueues);

static int singlethread_cpu;
@@ -320,10 +320,10 @@ void fastcall flush_workqueue(struct wor
} else {
int cpu;

- lock_cpu_hotplug();
+ mutex_lock(&workqueue_mutex);
for_each_online_cpu(cpu)
flush_cpu_workqueue(per_cpu_ptr(wq->cpu_wq, cpu));
- unlock_cpu_hotplug();
+ mutex_unlock(&workqueue_mutex);
}
}
EXPORT_SYMBOL_GPL(flush_workqueue);
@@ -371,8 +371,7 @@ struct workqueue_struct *__create_workqu
}

wq->name = name;
- /* We don't need the distraction of CPUs appearing and vanishing. */
- lock_cpu_hotplug();
+ mutex_lock(&workqueue_mutex);
if (singlethread) {
INIT_LIST_HEAD(&wq->list);
p = create_workqueue_thread(wq, singlethread_cpu);
@@ -381,9 +380,7 @@ struct workqueue_struct *__create_workqu
else
wake_up_process(p);
} else {
- spin_lock(&workqueue_lock);
list_add(&wq->list, &workqueues);
- spin_unlock(&workqueue_lock);
for_each_online_cpu(cpu) {
p = create_workqueue_thread(wq, cpu);
if (p) {
@@ -393,7 +390,7 @@ struct workqueue_struct *__create_workqu
destroy = 1;
}
}
- unlock_cpu_hotplug();
+ mutex_unlock(&workqueue_mutex);

/*
* Was there any error during startup? If yes then clean up:
@@ -434,17 +431,15 @@ void destroy_workqueue(struct workqueue_
flush_workqueue(wq);

/* We don't need the distraction of CPUs appearing and vanishing. */
- lock_cpu_hotplug();
+ mutex_lock(&workqueue_mutex);
if (is_single_threaded(wq))
cleanup_workqueue_thread(wq, singlethread_cpu);
else {
for_each_online_cpu(cpu)
cleanup_workqueue_thread(wq, cpu);
- spin_lock(&workqueue_lock);
list_del(&wq->list);
- spin_unlock(&workqueue_lock);
}
- unlock_cpu_hotplug();
+ mutex_unlock(&workqueue_mutex);
free_percpu(wq->cpu_wq);
kfree(wq);
}
@@ -515,11 +510,13 @@ int schedule_on_each_cpu(void (*func)(vo
if (!works)
return -ENOMEM;

+ mutex_lock(&workqueue_mutex);
for_each_online_cpu(cpu) {
INIT_WORK(per_cpu_ptr(works, cpu), func, info);
__queue_work(per_cpu_ptr(keventd_wq->cpu_wq, cpu),
per_cpu_ptr(works, cpu));
}
+ mutex_unlock(&workqueue_mutex);
flush_workqueue(keventd_wq);
free_percpu(works);
return 0;
@@ -635,6 +632,7 @@ static int __devinit workqueue_cpu_callb

switch (action) {
case CPU_UP_PREPARE:
+ mutex_lock(&workqueue_mutex);
/* Create a new workqueue thread for it. */
list_for_each_entry(wq, &workqueues, list) {
if (!create_workqueue_thread(wq, hotcpu)) {
@@ -653,6 +651,7 @@ static int __devinit workqueue_cpu_callb
kthread_bind(cwq->thread, hotcpu);
wake_up_process(cwq->thread);
}
+ mutex_unlock(&workqueue_mutex);
break;

case CPU_UP_CANCELED:
@@ -664,6 +663,15 @@ static int __devinit workqueue_cpu_callb
any_online_cpu(cpu_online_map));
cleanup_workqueue_thread(wq, hotcpu);
}
+ mutex_unlock(&workqueue_mutex);
+ break;
+
+ case CPU_DOWN_PREPARE:
+ mutex_lock(&workqueue_mutex);
+ break;
+
+ case CPU_DOWN_FAILED:
+ mutex_unlock(&workqueue_mutex);
break;

case CPU_DEAD:
@@ -671,6 +679,7 @@ static int __devinit workqueue_cpu_callb
cleanup_workqueue_thread(wq, hotcpu);
list_for_each_entry(wq, &workqueues, list)
take_over_work(wq, hotcpu);
+ mutex_unlock(&workqueue_mutex);
break;
}

_

2006-08-05 10:54:12

by Michal Piotrowski

[permalink] [raw]
Subject: Re: 2.6.18-rc3-g3b445eea BUG: warning at /usr/src/linux-git/kernel/cpu.c:51/unlock_cpu_hotplug()

On 05/08/06, Dave Jones <[email protected]> wrote:
> On Fri, Aug 04, 2006 at 10:49:47PM -0400, Dave Jones wrote:
>
> This trace now makes a lot more sense to me.
>
> > CPU1 called lock_cpu_hotplug() for app cpuspeed. recursive_depth=0
> > [<c0104edc>] show_trace_log_lvl+0x58/0x152
> > [<c01054c2>] show_trace+0xd/0x10
> > [<c01055db>] dump_stack+0x19/0x1b
> > [<c013e8c3>] lock_cpu_hotplug+0x39/0xbf
> > [<c029fbae>] store_scaling_governor+0x142/0x1a3
> > [<c029f1a5>] store+0x37/0x48
> > [<c01a6561>] sysfs_write_file+0xab/0xd1
> > [<c016f99f>] vfs_write+0xab/0x157
> > [<c016ffe4>] sys_write+0x3b/0x60
> > [<c0103db9>] sysenter_past_esp+0x56/0x8d
> > cpuspeed acquired cpu_bitmask_lock
> >
> > CPU1 called lock_cpu_hotplug() for app cpuspeed. recursive_depth=0
> > [<c0104edc>] show_trace_log_lvl+0x58/0x152
> > [<c01054c2>] show_trace+0xd/0x10
> > [<c01055db>] dump_stack+0x19/0x1b
> > [<c013e8c3>] lock_cpu_hotplug+0x39/0xbf
> > [<c0132f3c>] __create_workqueue+0x52/0x122
> > [<f901234b>] cpufreq_governor_dbs+0x9f/0x2c3 [cpufreq_ondemand]
> > [<c029f7b6>] __cpufreq_governor+0x57/0xd8
> > [<c029f985>] __cpufreq_set_policy+0x14e/0x1bc
> > [<c029fbc5>] store_scaling_governor+0x159/0x1a3
> > [<c029f1a5>] store+0x37/0x48
> > [<c01a6561>] sysfs_write_file+0xab/0xd1
> > [<c016f99f>] vfs_write+0xab/0x157
> > [<c016ffe4>] sys_write+0x3b/0x60
> > [<c0103db9>] sysenter_past_esp+0x56/0x8d
> > Lukewarm IQ detected in hotplug locking
> > BUG: warning at kernel/cpu.c:46/lock_cpu_hotplug()
>
> So when we write to sysfs to set the governor, we end up in store_scaling_governor()
> which takes the hotplug lock, and then calls off into the governor to let it
> do its thing. Part of ondemand's "thing" is to create a workqueue.
> unfortunatly, __create_workqueue also takes the hotplug lock.
>
> Creating a variant of __create_workqueue that doesn't take the lock
> seems really nasty.
>
> We could remove the locking from store_scaling_governor() and make the governors
> themselves have to do the locking, but I'm not sure that's entirely safe.
>
> We could do something really disgusting like ...
>
> unlock_cpu_hotplug()
> ...
> create_workqueue()
> ...
> lock_cpu_hotplug()
>
> in ondemand, which opens up a tiny race window, but as ugly as it is,
> looks to be the best solution of the bunch right now.
>
> Comments?
>
> The really sad part is this is completely unrelated to the original bug reported
> in this thread, which shows just how widespread this braindamage is.
> Michal's traces really don't really scream anything obvious to me.
> (Though given it took me 4 hours to decode my own traces above, this is no
> real sign of how big a problem this might be).
>
> Michal, could you apply this diff.. http://lkml.org/lkml/diff/2006/8/4/381/1
> (change the '120' to '60' first), and send me the debug spew that you get ?
> You'll have to wait until a minute of uptime has passed. Oh, and edit
> include/linux/jiffies.h to change INITIAL_JIFFIES to '0'.

p4-clockmod: P4/Xeon(TM) CPU On-Demand Clock Modulation available
ip_tables: (C) 2000-2006 Netfilter Core Team
Netfilter messages via NETLINK v0.30.
BUG: warning at /usr/src/linux-git/kernel/cpu.c:69/unlock_cpu_hotplug()
[<c0132a91>] unlock_cpu_hotplug+0x63/0xb2
[<c013a3a0>] stop_machine_run+0x2e/0x34
[<c013573a>] sys_init_module+0x15a0/0x178a
[<c015b86b>] do_sync_read+0xb6/0xf1
[<c0102d51>] sysenter_past_esp+0x56/0x79
ip_conntrack version 2.4 (8192 buckets, 65536 max) - 224 bytes per conntrack
NET: Registered protocol family 17
skge eth0: enabling interface
skge eth0: Link is up at 100 Mbps, full duplex, flow control tx and rx
w83627hf 9191-0290: Reading VID from GPIO5
NET: Registered protocol family 10
IPv6 over IPv4 tunneling driver
audit(1154774854.770:5): avc: denied { write } for pid=1360
comm="cpuspeed" name="cpufreq" dev=sysfs ino=4863
scontext=system_u:system_r:cpuspeed_t:s0
tcontext=system_u:object_r:sysfs_t:s0 tclass=dir
eth0: no IPv6 routers present
CPU0 called lock_cpu_hotplug() for app amarokapp. recursive_depth=0
[<c01329ab>] lock_cpu_hotplug+0x36/0xb9
[<c01182ce>] sched_getaffinity+0xf/0x83
[<c0118361>] sys_sched_getaffinity+0x1f/0x41
[<c0102d51>] sysenter_past_esp+0x56/0x79
amarokapp acquired cpu_bitmask_lock
CPU0 called unlock_cpu_hotplug() for app amarokapp. recursive_depth=0
[<c0132a62>] unlock_cpu_hotplug+0x34/0xb2
[<c0118323>] sched_getaffinity+0x64/0x83
[<c0118361>] sys_sched_getaffinity+0x1f/0x41
[<c0102d51>] sysenter_past_esp+0x56/0x79
amarokapp released cpu_bitmask_lock
CPU0 called lock_cpu_hotplug() for app amarokapp. recursive_depth=0
[<c01329ab>] lock_cpu_hotplug+0x36/0xb9
[<c01195ca>] sched_setaffinity+0xf/0xd5
[<c01196cb>] sys_sched_setaffinity+0x3b/0x41
[<c0102d51>] sysenter_past_esp+0x56/0x79
amarokapp acquired cpu_bitmask_lock
CPU0 called unlock_cpu_hotplug() for app amarokapp. recursive_depth=0
[<c0132a62>] unlock_cpu_hotplug+0x34/0xb2
[<c0119689>] sched_setaffinity+0xce/0xd5
[<c01196cb>] sys_sched_setaffinity+0x3b/0x41
[<c0102d51>] sysenter_past_esp+0x56/0x79
amarokapp released cpu_bitmask_lock

>
> Dave
>
> --
> http://www.codemonkey.org.uk
>

Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/wiki/)

2006-08-05 11:11:18

by Michal Piotrowski

[permalink] [raw]
Subject: Re: 2.6.18-rc3-g3b445eea BUG: warning at /usr/src/linux-git/kernel/cpu.c:51/unlock_cpu_hotplug()

On 05/08/06, Michal Piotrowski <[email protected]> wrote:
> On 05/08/06, Dave Jones <[email protected]> wrote:
> > On Fri, Aug 04, 2006 at 10:49:47PM -0400, Dave Jones wrote:
> >
> > This trace now makes a lot more sense to me.
> >
> > > CPU1 called lock_cpu_hotplug() for app cpuspeed. recursive_depth=0
> > > [<c0104edc>] show_trace_log_lvl+0x58/0x152
> > > [<c01054c2>] show_trace+0xd/0x10
> > > [<c01055db>] dump_stack+0x19/0x1b
> > > [<c013e8c3>] lock_cpu_hotplug+0x39/0xbf
> > > [<c029fbae>] store_scaling_governor+0x142/0x1a3
> > > [<c029f1a5>] store+0x37/0x48
> > > [<c01a6561>] sysfs_write_file+0xab/0xd1
> > > [<c016f99f>] vfs_write+0xab/0x157
> > > [<c016ffe4>] sys_write+0x3b/0x60
> > > [<c0103db9>] sysenter_past_esp+0x56/0x8d
> > > cpuspeed acquired cpu_bitmask_lock
> > >
> > > CPU1 called lock_cpu_hotplug() for app cpuspeed. recursive_depth=0
> > > [<c0104edc>] show_trace_log_lvl+0x58/0x152
> > > [<c01054c2>] show_trace+0xd/0x10
> > > [<c01055db>] dump_stack+0x19/0x1b
> > > [<c013e8c3>] lock_cpu_hotplug+0x39/0xbf
> > > [<c0132f3c>] __create_workqueue+0x52/0x122
> > > [<f901234b>] cpufreq_governor_dbs+0x9f/0x2c3 [cpufreq_ondemand]
> > > [<c029f7b6>] __cpufreq_governor+0x57/0xd8
> > > [<c029f985>] __cpufreq_set_policy+0x14e/0x1bc
> > > [<c029fbc5>] store_scaling_governor+0x159/0x1a3
> > > [<c029f1a5>] store+0x37/0x48
> > > [<c01a6561>] sysfs_write_file+0xab/0xd1
> > > [<c016f99f>] vfs_write+0xab/0x157
> > > [<c016ffe4>] sys_write+0x3b/0x60
> > > [<c0103db9>] sysenter_past_esp+0x56/0x8d
> > > Lukewarm IQ detected in hotplug locking
> > > BUG: warning at kernel/cpu.c:46/lock_cpu_hotplug()
> >
> > So when we write to sysfs to set the governor, we end up in store_scaling_governor()
> > which takes the hotplug lock, and then calls off into the governor to let it
> > do its thing. Part of ondemand's "thing" is to create a workqueue.
> > unfortunatly, __create_workqueue also takes the hotplug lock.
> >
> > Creating a variant of __create_workqueue that doesn't take the lock
> > seems really nasty.
> >
> > We could remove the locking from store_scaling_governor() and make the governors
> > themselves have to do the locking, but I'm not sure that's entirely safe.
> >
> > We could do something really disgusting like ...
> >
> > unlock_cpu_hotplug()
> > ...
> > create_workqueue()
> > ...
> > lock_cpu_hotplug()
> >
> > in ondemand, which opens up a tiny race window, but as ugly as it is,
> > looks to be the best solution of the bunch right now.
> >
> > Comments?
> >
> > The really sad part is this is completely unrelated to the original bug reported
> > in this thread, which shows just how widespread this braindamage is.
> > Michal's traces really don't really scream anything obvious to me.
> > (Though given it took me 4 hours to decode my own traces above, this is no
> > real sign of how big a problem this might be).
> >
> > Michal, could you apply this diff.. http://lkml.org/lkml/diff/2006/8/4/381/1
> > (change the '120' to '60' first), and send me the debug spew that you get ?
> > You'll have to wait until a minute of uptime has passed. Oh, and edit
> > include/linux/jiffies.h to change INITIAL_JIFFIES to '0'.
>
> p4-clockmod: P4/Xeon(TM) CPU On-Demand Clock Modulation available
> ip_tables: (C) 2000-2006 Netfilter Core Team
> Netfilter messages via NETLINK v0.30.
> BUG: warning at /usr/src/linux-git/kernel/cpu.c:69/unlock_cpu_hotplug()
> [<c0132a91>] unlock_cpu_hotplug+0x63/0xb2
> [<c013a3a0>] stop_machine_run+0x2e/0x34
> [<c013573a>] sys_init_module+0x15a0/0x178a
> [<c015b86b>] do_sync_read+0xb6/0xf1
> [<c0102d51>] sysenter_past_esp+0x56/0x79
> ip_conntrack version 2.4 (8192 buckets, 65536 max) - 224 bytes per conntrack
> NET: Registered protocol family 17
> skge eth0: enabling interface
> skge eth0: Link is up at 100 Mbps, full duplex, flow control tx and rx
> w83627hf 9191-0290: Reading VID from GPIO5
> NET: Registered protocol family 10
> IPv6 over IPv4 tunneling driver
> audit(1154774854.770:5): avc: denied { write } for pid=1360
> comm="cpuspeed" name="cpufreq" dev=sysfs ino=4863
> scontext=system_u:system_r:cpuspeed_t:s0
> tcontext=system_u:object_r:sysfs_t:s0 tclass=dir
> eth0: no IPv6 routers present
> CPU0 called lock_cpu_hotplug() for app amarokapp. recursive_depth=0
> [<c01329ab>] lock_cpu_hotplug+0x36/0xb9
> [<c01182ce>] sched_getaffinity+0xf/0x83
> [<c0118361>] sys_sched_getaffinity+0x1f/0x41
> [<c0102d51>] sysenter_past_esp+0x56/0x79
> amarokapp acquired cpu_bitmask_lock
> CPU0 called unlock_cpu_hotplug() for app amarokapp. recursive_depth=0
> [<c0132a62>] unlock_cpu_hotplug+0x34/0xb2
> [<c0118323>] sched_getaffinity+0x64/0x83
> [<c0118361>] sys_sched_getaffinity+0x1f/0x41
> [<c0102d51>] sysenter_past_esp+0x56/0x79
> amarokapp released cpu_bitmask_lock
> CPU0 called lock_cpu_hotplug() for app amarokapp. recursive_depth=0
> [<c01329ab>] lock_cpu_hotplug+0x36/0xb9
> [<c01195ca>] sched_setaffinity+0xf/0xd5
> [<c01196cb>] sys_sched_setaffinity+0x3b/0x41
> [<c0102d51>] sysenter_past_esp+0x56/0x79
> amarokapp acquired cpu_bitmask_lock
> CPU0 called unlock_cpu_hotplug() for app amarokapp. recursive_depth=0
> [<c0132a62>] unlock_cpu_hotplug+0x34/0xb2
> [<c0119689>] sched_setaffinity+0xce/0xd5
> [<c01196cb>] sys_sched_setaffinity+0x3b/0x41
> [<c0102d51>] sysenter_past_esp+0x56/0x79
> amarokapp released cpu_bitmask_lock

audit(1154775731.471:6): avc: denied { write } for pid=1361
comm="cpuspeed" name="cpufreq" dev=sysfs ino=4863
scontext=system_u:system_r:cpuspeed_t:s0
tcontext=system_u:object_r:sysfs_t:s0 tclass=dir
CPU0 called lock_cpu_hotplug() for app konsole. recursive_depth=0
[<c01329ab>] lock_cpu_hotplug+0x36/0xb9
[<c012aadf>] flush_workqueue+0x25/0x57
[<c020e1b7>] release_dev+0x485/0x5f2
[<c016f7f8>] d_lookup+0x1b/0x3b
[<c012d8d6>] remove_wait_queue+0xc/0x34
[<c020e333>] tty_release+0xf/0x18
[<c015c943>] __fput+0x90/0x15e
[<c015a361>] filp_close+0x4e/0x54
[<c0102d51>] sysenter_past_esp+0x56/0x79
konsole acquired cpu_bitmask_lock
CPU0 called unlock_cpu_hotplug() for app konsole. recursive_depth=0
[<c0132a62>] unlock_cpu_hotplug+0x34/0xb2
[<c020e1b7>] release_dev+0x485/0x5f2
[<c016f7f8>] d_lookup+0x1b/0x3b
[<c012d8d6>] remove_wait_queue+0xc/0x34
[<c020e333>] tty_release+0xf/0x18
[<c015c943>] __fput+0x90/0x15e
[<c015a361>] filp_close+0x4e/0x54
[<c0102d51>] sysenter_past_esp+0x56/0x79
konsole released cpu_bitmask_lock

Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/wiki/)

2006-08-05 11:26:50

by Michal Piotrowski

[permalink] [raw]
Subject: Re: 2.6.18-rc3-g3b445eea BUG: warning at /usr/src/linux-git/kernel/cpu.c:51/unlock_cpu_hotplug()

On 05/08/06, Michal Piotrowski <[email protected]> wrote:
> audit(1154775731.471:6): avc: denied { write } for pid=1361
> comm="cpuspeed" name="cpufreq" dev=sysfs ino=4863
> scontext=system_u:system_r:cpuspeed_t:s0
> tcontext=system_u:object_r:sysfs_t:s0 tclass=dir
> CPU0 called lock_cpu_hotplug() for app konsole. recursive_depth=0
> [<c01329ab>] lock_cpu_hotplug+0x36/0xb9
> [<c012aadf>] flush_workqueue+0x25/0x57
> [<c020e1b7>] release_dev+0x485/0x5f2
> [<c016f7f8>] d_lookup+0x1b/0x3b
> [<c012d8d6>] remove_wait_queue+0xc/0x34
> [<c020e333>] tty_release+0xf/0x18
> [<c015c943>] __fput+0x90/0x15e
> [<c015a361>] filp_close+0x4e/0x54
> [<c0102d51>] sysenter_past_esp+0x56/0x79
> konsole acquired cpu_bitmask_lock
> CPU0 called unlock_cpu_hotplug() for app konsole. recursive_depth=0
> [<c0132a62>] unlock_cpu_hotplug+0x34/0xb2
> [<c020e1b7>] release_dev+0x485/0x5f2
> [<c016f7f8>] d_lookup+0x1b/0x3b
> [<c012d8d6>] remove_wait_queue+0xc/0x34
> [<c020e333>] tty_release+0xf/0x18
> [<c015c943>] __fput+0x90/0x15e
> [<c015a361>] filp_close+0x4e/0x54
> [<c0102d51>] sysenter_past_esp+0x56/0x79
> konsole released cpu_bitmask_lock

Aug 5 13:18:00 ltg01-fedora kernel: CPU0 called lock_cpu_hotplug()
for app kded. recursive_depth=0
Aug 5 13:18:00 ltg01-fedora kernel: [<c01329ab>] lock_cpu_hotplug+0x36/0xb9
Aug 5 13:18:00 ltg01-fedora kernel: [<c012aadf>] flush_workqueue+0x25/0x57
Aug 5 13:18:00 ltg01-fedora kernel: [<c020e1b7>] release_dev+0x485/0x5f2
Aug 5 13:18:00 ltg01-fedora kernel: [<c014c28f>] do_wp_page+0x318/0x348
Aug 5 13:18:00 ltg01-fedora kernel: [<c01425d4>] find_get_page+0x36/0x3b
Aug 5 13:18:00 ltg01-fedora kernel: [<c014d46b>] __handle_mm_fault+0x6bf/0x727
Aug 5 13:18:00 ltg01-fedora kernel: [<c020e333>] tty_release+0xf/0x18
Aug 5 13:18:00 ltg01-fedora kernel: [<c015c943>] __fput+0x90/0x15e
Aug 5 13:18:00 ltg01-fedora kernel: [<c015a361>] filp_close+0x4e/0x54
Aug 5 13:18:00 ltg01-fedora kernel: [<c0102d51>] sysenter_past_esp+0x56/0x79
Aug 5 13:18:00 ltg01-fedora kernel: kded acquired cpu_bitmask_lock
[..]
Aug 5 13:18:03 ltg01-fedora kernel: CPU1 called lock_cpu_hotplug()
for app mingetty. recursive_depth=0
Aug 5 13:18:03 ltg01-fedora kernel: [<c01329ab>] lock_cpu_hotplug+0x36/0xb9
Aug 5 13:18:03 ltg01-fedora kernel: [<c012aadf>] flush_workqueue+0x25/0x57
Aug 5 13:18:03 ltg01-fedora kernel: [<c020e1b7>] release_dev+0x485/0x5f2
Aug 5 13:18:03 ltg01-fedora kernel: [<c01463a2>] __pagevec_free+0x18/0x22
Aug 5 13:18:03 ltg01-fedora kernel: [<c0148251>] release_pages+0x13b/0x143
Aug 5 13:18:03 ltg01-fedora kernel: [<c020e333>] tty_release+0xf/0x18
Aug 5 13:18:03 ltg01-fedora kernel: [<c015c943>] __fput+0x90/0x15e
Aug 5 13:18:03 ltg01-fedora kernel: [<c015a361>] filp_close+0x4e/0x54
Aug 5 13:18:03 ltg01-fedora kernel: [<c011e477>] put_files_struct+0x65/0xa7
Aug 5 13:18:03 ltg01-fedora kernel: [<c011f44f>] do_exit+0x1dd/0x771
Aug 5 13:18:03 ltg01-fedora kernel: [<c011fa59>] sys_exit_group+0x0/0xd
Aug 5 13:18:03 ltg01-fedora kernel: [<c012766a>]
get_signal_to_deliver+0x387/0x3cc
Aug 5 13:18:03 ltg01-fedora kernel: [<c01024ba>] do_notify_resume+0x71/0x5df
Aug 5 13:18:03 ltg01-fedora kernel: [<c012d8d6>] remove_wait_queue+0xc/0x34
Aug 5 13:18:03 ltg01-fedora kernel: [<c0211a89>] read_chan+0x0/0x567
Aug 5 13:18:03 ltg01-fedora kernel: [<c020cc00>] tty_ldisc_deref+0x50/0x5f
Aug 5 13:18:03 ltg01-fedora kernel: [<c020ede8>] tty_read+0x6d/0xb4
Aug 5 13:18:03 ltg01-fedora kernel: [<c020ed7b>] tty_read+0x0/0xb4
Aug 5 13:18:03 ltg01-fedora kernel: [<c015c174>] vfs_read+0x9f/0x141
Aug 5 13:18:03 ltg01-fedora kernel: [<c015c5c0>] sys_read+0x3c/0x63
Aug 5 13:18:03 ltg01-fedora kernel: [<c0102e4a>] work_notifysig+0x13/0x19
Aug 5 13:18:03 ltg01-fedora kernel: mingetty acquired cpu_bitmask_lock
[..]
Aug 5 13:18:03 ltg01-fedora kernel: CPU1 called lock_cpu_hotplug()
for app mingetty. recursive_depth=0
Aug 5 13:18:03 ltg01-fedora kernel: [<c01329ab>] lock_cpu_hotplug+0x36/0xb9
Aug 5 13:18:03 ltg01-fedora kernel: [<c012aadf>] flush_workqueue+0x25/0x57
Aug 5 13:18:03 ltg01-fedora kernel: [<c020e1b7>] release_dev+0x485/0x5f2
Aug 5 13:18:03 ltg01-fedora kernel: [<c0115c9e>] __wake_up_common+0x2f/0x53
Aug 5 13:18:03 ltg01-fedora kernel: [<c0115cd3>] __wake_up_locked+0x11/0x15
Aug 5 13:18:03 ltg01-fedora kernel: [<c02b5ef5>] __down+0xa1/0xd7
Aug 5 13:18:03 ltg01-fedora kernel: [<c0117186>] default_wake_function+0x0/0xc
Aug 5 13:18:03 ltg01-fedora kernel: [<c020e333>] tty_release+0xf/0x18
Aug 5 13:18:03 ltg01-fedora kernel: [<c015c943>] __fput+0x90/0x15e
Aug 5 13:18:03 ltg01-fedora kernel: [<c015a361>] filp_close+0x4e/0x54
Aug 5 13:18:03 ltg01-fedora kernel: [<c011e477>] put_files_struct+0x65/0xa7
Aug 5 13:18:03 ltg01-fedora kernel: [<c011f44f>] do_exit+0x1dd/0x771
Aug 5 13:18:03 ltg01-fedora kernel: [<c011fa59>] sys_exit_group+0x0/0xd
Aug 5 13:18:03 ltg01-fedora kernel: [<c012766a>]
get_signal_to_deliver+0x387/0x3cc
Aug 5 13:18:03 ltg01-fedora kernel: [<c01024ba>] do_notify_resume+0x71/0x5df
Aug 5 13:18:03 ltg01-fedora kernel: [<c012d8d6>] remove_wait_queue+0xc/0x34
Aug 5 13:18:03 ltg01-fedora kernel: [<c0211a89>] read_chan+0x0/0x567
Aug 5 13:18:03 ltg01-fedora kernel: [<c020cc00>] tty_ldisc_deref+0x50/0x5f
Aug 5 13:18:03 ltg01-fedora kernel: [<c020ede8>] tty_read+0x6d/0xb4
Aug 5 13:18:03 ltg01-fedora kernel: [<c020ed7b>] tty_read+0x0/0xb4
Aug 5 13:18:03 ltg01-fedora kernel: [<c015c174>] vfs_read+0x9f/0x141
Aug 5 13:18:03 ltg01-fedora kernel: [<c015c5c0>] sys_read+0x3c/0x63
Aug 5 13:18:03 ltg01-fedora kernel: [<c0102e4a>] work_notifysig+0x13/0x19

2006-08-05 18:48:03

by Dave Jones

[permalink] [raw]
Subject: Re: 2.6.18-rc3-g3b445eea BUG: warning at /usr/src/linux-git/kernel/cpu.c:51/unlock_cpu_hotplug()

On Sat, Aug 05, 2006 at 01:26:49PM +0200, Michal Piotrowski wrote:

> Aug 5 13:18:00 ltg01-fedora kernel: CPU0 called lock_cpu_hotplug()
> for app kded. recursive_depth=0
> *more snipped traces*

The interesting ones will be the ones before & after you hit that
BUG: warning at /usr/src/linux-work1/kernel/cpu.c:51/unlock_cpu_hotplug()
if you can make that happen again.

Dave

--
http://www.codemonkey.org.uk

2006-08-05 19:47:36

by Dave Jones

[permalink] [raw]
Subject: Re: 2.6.18-rc3-g3b445eea BUG: warning at /usr/src/linux-git/kernel/cpu.c:51/unlock_cpu_hotplug()

On Sat, Aug 05, 2006 at 12:46:00AM -0700, Andrew Morton wrote:
> @@ -320,10 +320,10 @@ void fastcall flush_workqueue(struct wor
> } else {
> int cpu;
>
> - lock_cpu_hotplug();
> + mutex_lock(&workqueue_mutex);
> for_each_online_cpu(cpu)
> flush_cpu_workqueue(per_cpu_ptr(wq->cpu_wq, cpu));
> - unlock_cpu_hotplug();
> + mutex_unlock(&workqueue_mutex);
> }
> }

Is this enough though? I mean, what stops the hotplug cpu code
from modifying cpu_online_map underneath us here, making for_each_online_cpu
do the wrong thing ?

Dave

--
http://www.codemonkey.org.uk

2006-08-05 20:02:19

by Andrew Morton

[permalink] [raw]
Subject: Re: 2.6.18-rc3-g3b445eea BUG: warning at /usr/src/linux-git/kernel/cpu.c:51/unlock_cpu_hotplug()

On Sat, 5 Aug 2006 15:47:30 -0400
Dave Jones <[email protected]> wrote:

> On Sat, Aug 05, 2006 at 12:46:00AM -0700, Andrew Morton wrote:
> > @@ -320,10 +320,10 @@ void fastcall flush_workqueue(struct wor
> > } else {
> > int cpu;
> >
> > - lock_cpu_hotplug();
> > + mutex_lock(&workqueue_mutex);
> > for_each_online_cpu(cpu)
> > flush_cpu_workqueue(per_cpu_ptr(wq->cpu_wq, cpu));
> > - unlock_cpu_hotplug();
> > + mutex_unlock(&workqueue_mutex);
> > }
> > }
>
> Is this enough though? I mean, what stops the hotplug cpu code
> from modifying cpu_online_map underneath us here, making for_each_online_cpu
> do the wrong thing ?
>

The patch takes that mutex in workqueue_cpu_callback() in such a way as to
prevent that. What it _effectively_ does is to change cpu_up() thusly:

--- a/kernel/cpu.c~x
+++ a/kernel/cpu.c
@@ -197,6 +197,7 @@ int __devinit cpu_up(unsigned int cpu)
}

ret = blocking_notifier_call_chain(&cpu_chain, CPU_UP_PREPARE, hcpu);
+ mutex_lock(&workqueue_mutex);
if (ret == NOTIFY_BAD) {
printk("%s: attempt to bring up CPU %u failed\n",
__FUNCTION__, cpu);
@@ -213,12 +214,15 @@ int __devinit cpu_up(unsigned int cpu)
BUG_ON(!cpu_online(cpu));

/* Now call notifier in preparation. */
+ mutex_unlock(&workqueue_mutex);
blocking_notifier_call_chain(&cpu_chain, CPU_ONLINE, hcpu);

out_notify:
- if (ret != 0)
+ if (ret != 0) {
blocking_notifier_call_chain(&cpu_chain,
CPU_UP_CANCELED, hcpu);
+ mutex_unlock(&workqueue_mutex);
+ }
out:
mutex_unlock(&cpu_add_remove_lock);
return ret;
_


and similarly in cpu_down().

So the workqueue code itself is protecting its per-cpu data from the hot
plug/unplug code across its critical regions.

2006-08-05 21:16:01

by Michal Piotrowski

[permalink] [raw]
Subject: Re: 2.6.18-rc3-g3b445eea BUG: warning at /usr/src/linux-git/kernel/cpu.c:51/unlock_cpu_hotplug()

On 05/08/06, Dave Jones <[email protected]> wrote:
> On Sat, Aug 05, 2006 at 01:26:49PM +0200, Michal Piotrowski wrote:
>
> > Aug 5 13:18:00 ltg01-fedora kernel: CPU0 called lock_cpu_hotplug()
> > for app kded. recursive_depth=0
> > *more snipped traces*
>
> The interesting ones will be the ones before & after you hit that
> BUG: warning at /usr/src/linux-work1/kernel/cpu.c:51/unlock_cpu_hotplug()
> if you can make that happen again.

I don't see nothing interesting before BUG: warning at
/usr/src/linux-git/kernel/cpu.c:51/unlock_cpu_hotplug()

Only

CPU0 called lock_cpu_hotplug() for app amarokapp. recursive_depth=0
[<c01329ab>] lock_cpu_hotplug+0x36/0xb9
[<c01182ce>] sched_getaffinity+0xf/0x83
[<c0118361>] sys_sched_getaffinity+0x1f/0x41
[<c0102d51>] sysenter_past_esp+0x56/0x79
amarokapp acquired cpu_bitmask_lock

appears after this warning.

dmesg -> http://www.stardust.webpages.pl/files/2.6-git/18-rc3/dmesg2

>
> Dave
>
> --
> http://www.codemonkey.org.uk
>

Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/wiki/)

2006-08-05 22:34:50

by art

[permalink] [raw]
Subject: Re: 2.6.18-rc3-g3b445eea BUG: warning at /usr/src/linux-git/kernel/cpu.c:51/unlock_cpu_hotplug()

same for 2.6.18.rc3-git7 on smp amd64 x2 (dualcore)
after
# echo "ondemand" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

dmesg

....
SELinux: initialized (dev sdb1, type vfat), uses genfs_contexts
Lukewarm IQ detected in hotplug locking
BUG: warning at kernel/cpu.c:38/lock_cpu_hotplug()

Call Trace:
[<ffffffff802a0e72>] lock_cpu_hotplug+0x50/0x73
[<ffffffff80299a36>] __create_workqueue+0x5d/0x146
[<ffffffff803f437d>] cpufreq_governor_dbs+0xa8/0x2f4
[<ffffffff803f22ac>] __cpufreq_governor+0xa4/0x194
[<ffffffff803f2516>] __cpufreq_set_policy+0x17a/0x1f4
[<ffffffff803f2792>] store_scaling_governor+0x17c/0x1d8
[<ffffffff803f35f8>] handle_update+0x0/0x28
[<ffffffff8026650b>] _spin_unlock_irqrestore+0x15/0x30
[<ffffffff803f1a00>] cpufreq_cpu_get+0xb5/0x143
[<ffffffff80272dbf>] store+0x44/0x5b
[<ffffffff802f1ddc>] sysfs_write_file+0xc5/0xf4
[<ffffffff80215e1b>] vfs_write+0xce/0x174
[<ffffffff802166a9>] sys_write+0x45/0x6e
[<ffffffff8025fa8e>] system_call+0x7e/0x83

------------------------------------------------------
my old message from 2006/08/03


2.6.18-rc3-git1-smp-amd64 - Lukewarm IQ detected in hotplug
what i have in dmesg
Linux version 2.6.18-rc3-git1 ([email protected]) (gcc version 4.1.1 20060525)) #1
SMP PREEMPT Tue Aug 1 09:46:44 CDT 2006
...
AMD Athlon(tm) 64 X2 Dual Core Processor 4200+ stepping 01
...
SELinux: initialized (dev binfmt_misc, type binfmt_misc), uses
genfs_contexts
ip_tables: (C) 2000-2006 Netfilter Core Team
BUG: warning at kernel/cpu.c:51/unlock_cpu_hotplug()
Call Trace:
[<ffffffff802a0e51>] unlock_cpu_hotplug+0x3f/0x6c
[<ffffffff803f3937>] store_speed+0xc2/0xd1
[<ffffffff80272d3f>] store+0x44/0x5b
[<ffffffff802f1c8a>] sysfs_write_file+0xc5/0xf4
[<ffffffff80215e14>] vfs_write+0xce/0x174
[<ffffffff802166a2>] sys_write+0x45/0x6e
[<ffffffff8025fa0e>] system_call+0x7e/0x83
Netfilter messages via NETLINK v0.30.
...
...
SELinux: initialized (dev sdb1, type vfat), uses genfs_contexts
Lukewarm IQ detected in hotplug locking
BUG: warning at kernel/cpu.c:38/lock_cpu_hotplug()
Call Trace:
[<ffffffff802a0def>] lock_cpu_hotplug+0x50/0x73
[<ffffffff802999ab>] __create_workqueue+0x5d/0x146
[<ffffffff803f4108>] cpufreq_governor_dbs+0xa8/0x2f4
[<ffffffff803f2006>] __cpufreq_governor+0xa4/0x194
[<ffffffff803f225f>] __cpufreq_set_policy+0x169/0x1d9
[<ffffffff803f24a8>] store_scaling_governor+0x153/0x18d
[<ffffffff803f32ed>] handle_update+0x0/0x28
[<ffffffff8026648b>] _spin_unlock_irqrestore+0x15/0x30
[<ffffffff803f1800>] cpufreq_cpu_get+0xb4/0x143
[<ffffffff80272d3f>] store+0x44/0x5b
[<ffffffff802f1c8a>] sysfs_write_file+0xc5/0xf4
[<ffffffff80215e14>] vfs_write+0xce/0x174
[<ffffffff802166a2>] sys_write+0x45/0x6e
[<ffffffff8025fa0e>] system_call+0x7e/0x83
hdc: CHECK for good STATUS
...
the same in 2.6.18-rc3-git3 any clue ?
-------------------------------------------------

xboom

[email protected]

2006-08-06 01:12:19

by Andrew Morton

[permalink] [raw]
Subject: Re: 2.6.18-rc3-g3b445eea BUG: warning at /usr/src/linux-git/kernel/cpu.c:51/unlock_cpu_hotplug()

On Sat, 5 Aug 2006 00:46:00 -0700
Andrew Morton <[email protected]> wrote:

> On Sat, 5 Aug 2006 02:47:28 -0400
> Dave Jones <[email protected]> wrote:
>
> > Creating a variant of __create_workqueue that doesn't take the lock
> > seems really nasty.
>
> Honestly, the default fix for lock_cpu_hotplug() problems should be to
> delete the lock_cpu_hotplug() calls and to implement local locking.
>
> Here's a not-runtime-tested workqueue.c example.

It's now slightly-runtime-tested. I'll include it in rc3-mm1.

2006-08-06 01:23:23

by Dave Jones

[permalink] [raw]
Subject: Re: 2.6.18-rc3-g3b445eea BUG: warning at /usr/src/linux-git/kernel/cpu.c:51/unlock_cpu_hotplug()

On Sat, Aug 05, 2006 at 05:14:10PM -0500, art wrote:
> same for 2.6.18.rc3-git7 on smp amd64 x2 (dualcore)
> after
> # echo "ondemand" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
>
> dmesg
>
> ....
> SELinux: initialized (dev sdb1, type vfat), uses genfs_contexts
> Lukewarm IQ detected in hotplug locking
> BUG: warning at kernel/cpu.c:38/lock_cpu_hotplug()
>
> Call Trace:
> [<ffffffff802a0e72>] lock_cpu_hotplug+0x50/0x73
> [<ffffffff80299a36>] __create_workqueue+0x5d/0x146
> [<ffffffff803f437d>] cpufreq_governor_dbs+0xa8/0x2f4
> [<ffffffff803f22ac>] __cpufreq_governor+0xa4/0x194
> [<ffffffff803f2516>] __cpufreq_set_policy+0x17a/0x1f4
> [<ffffffff803f2792>] store_scaling_governor+0x17c/0x1d8
> [<ffffffff803f35f8>] handle_update+0x0/0x28
> [<ffffffff8026650b>] _spin_unlock_irqrestore+0x15/0x30
> [<ffffffff803f1a00>] cpufreq_cpu_get+0xb5/0x143
> [<ffffffff80272dbf>] store+0x44/0x5b
> [<ffffffff802f1ddc>] sysfs_write_file+0xc5/0xf4
> [<ffffffff80215e1b>] vfs_write+0xce/0x174
> [<ffffffff802166a9>] sys_write+0x45/0x6e
> [<ffffffff8025fa8e>] system_call+0x7e/0x83

Ok, these should be fixed with Andrews voodoo patches to the workqueue code.

Dave

--
http://www.codemonkey.org.uk

2006-08-06 16:59:56

by Michal Piotrowski

[permalink] [raw]
Subject: Re: 2.6.18-rc3-g3b445eea BUG: warning at /usr/src/linux-git/kernel/cpu.c:51/unlock_cpu_hotplug()

Hi Dave,

On 05/08/06, Dave Jones <[email protected]> wrote:
> On Fri, Aug 04, 2006 at 10:49:47PM -0400, Dave Jones wrote:
>
> This trace now makes a lot more sense to me.
>
> > CPU1 called lock_cpu_hotplug() for app cpuspeed. recursive_depth=0
> > [<c0104edc>] show_trace_log_lvl+0x58/0x152
> > [<c01054c2>] show_trace+0xd/0x10
> > [<c01055db>] dump_stack+0x19/0x1b
> > [<c013e8c3>] lock_cpu_hotplug+0x39/0xbf
> > [<c029fbae>] store_scaling_governor+0x142/0x1a3
> > [<c029f1a5>] store+0x37/0x48
> > [<c01a6561>] sysfs_write_file+0xab/0xd1
> > [<c016f99f>] vfs_write+0xab/0x157
> > [<c016ffe4>] sys_write+0x3b/0x60
> > [<c0103db9>] sysenter_past_esp+0x56/0x8d
> > cpuspeed acquired cpu_bitmask_lock
> >
> > CPU1 called lock_cpu_hotplug() for app cpuspeed. recursive_depth=0
> > [<c0104edc>] show_trace_log_lvl+0x58/0x152
> > [<c01054c2>] show_trace+0xd/0x10
> > [<c01055db>] dump_stack+0x19/0x1b
> > [<c013e8c3>] lock_cpu_hotplug+0x39/0xbf
> > [<c0132f3c>] __create_workqueue+0x52/0x122
> > [<f901234b>] cpufreq_governor_dbs+0x9f/0x2c3 [cpufreq_ondemand]
> > [<c029f7b6>] __cpufreq_governor+0x57/0xd8
> > [<c029f985>] __cpufreq_set_policy+0x14e/0x1bc
> > [<c029fbc5>] store_scaling_governor+0x159/0x1a3
> > [<c029f1a5>] store+0x37/0x48
> > [<c01a6561>] sysfs_write_file+0xab/0xd1
> > [<c016f99f>] vfs_write+0xab/0x157
> > [<c016ffe4>] sys_write+0x3b/0x60
> > [<c0103db9>] sysenter_past_esp+0x56/0x8d
> > Lukewarm IQ detected in hotplug locking
> > BUG: warning at kernel/cpu.c:46/lock_cpu_hotplug()
>
> So when we write to sysfs to set the governor, we end up in store_scaling_governor()
> which takes the hotplug lock, and then calls off into the governor to let it
> do its thing. Part of ondemand's "thing" is to create a workqueue.
> unfortunatly, __create_workqueue also takes the hotplug lock.
>
> Creating a variant of __create_workqueue that doesn't take the lock
> seems really nasty.
>
> We could remove the locking from store_scaling_governor() and make the governors
> themselves have to do the locking, but I'm not sure that's entirely safe.
>
> We could do something really disgusting like ...
>
> unlock_cpu_hotplug()
> ...
> create_workqueue()
> ...
> lock_cpu_hotplug()
>
> in ondemand, which opens up a tiny race window, but as ugly as it is,
> looks to be the best solution of the bunch right now.
>
> Comments?
>
> The really sad part is this is completely unrelated to the original bug reported
> in this thread, which shows just how widespread this braindamage is.
> Michal's traces really don't really scream anything obvious to me.
> (Though given it took me 4 hours to decode my own traces above, this is no
> real sign of how big a problem this might be).
>
> Michal, could you apply this diff.. http://lkml.org/lkml/diff/2006/8/4/381/1
> (change the '120' to '60' first), and send me the debug spew that you get ?
> You'll have to wait until a minute of uptime has passed. Oh, and edit
> include/linux/jiffies.h to change INITIAL_JIFFIES to '0'.

I hope that this one will help

BUG: using smp_processor_id() in preemptible [00000001] code: cpuspeed/1433
caller is lock_cpu_hotplug+0x25/0xc5
[<fd95918f>] store_speed+0x36/0x9b [cpufreq_userspace]
[<c029fa95>] store+0x37/0x48
[<c0104007>] show_trace_log_lvl+0x58/0x159
[<c0104765>] show_trace+0xd/0x10
[<c010482d>] dump_stack+0x19/0x1b
[<c01fc842>] debug_smp_processor_id+0x96/0xac
[<c01a8197>] sysfs_write_file+0xa6/0xcc
[<c013e0db>] lock_cpu_hotplug+0x25/0xc5
[<fd95918f>] store_speed+0x36/0x9b [cpufreq_userspace]
[<c029fa95>] store+0x37/0x48
[<c01a8197>] sysfs_write_file+0xa6/0xcc
[<c0171577>] vfs_write+0xcd/0x179
[<c0171c20>] sys_write+0x3b/0x71
[<c010318d>] sysenter_past_esp+0x56/0x8d
DWARF2 unwinder stuck at sysenter_past_esp+0x56/0x8d
Leftover inexact backtrace:
[<c0104765>] show_trace+0xd/0x10
[<c010482d>] dump_stack+0x19/0x1b
[<c01fc842>] debug_smp_processor_id+0x96/0xac
[<c013e0db>] lock_cpu_hotplug+0x25/0xc5
[<fd95918f>] store_speed+0x36/0x9b [cpufreq_userspace]
[<c029fa95>] store+0x37/0x48
[<c01a8197>] [<c0171577>] vfs_write+0xcd/0x179

It's from 2.6.18-rc3-mm2.

Config and dmesg log -> http://www.stardust.webpages.pl/files/mm/2.6.18-rc3-mm2/

>
> Dave
>
> --
> http://www.codemonkey.org.uk
>

Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/wiki/)

2006-08-06 18:05:10

by Dave Jones

[permalink] [raw]
Subject: Re: 2.6.18-rc3-g3b445eea BUG: warning at /usr/src/linux-git/kernel/cpu.c:51/unlock_cpu_hotplug()

On Sun, Aug 06, 2006 at 06:59:54PM +0200, Michal Piotrowski wrote:

> > Michal, could you apply this diff.. http://lkml.org/lkml/diff/2006/8/4/381/1
> > (change the '120' to '60' first), and send me the debug spew that you get ?
> > You'll have to wait until a minute of uptime has passed. Oh, and edit
> > include/linux/jiffies.h to change INITIAL_JIFFIES to '0'.
>
> I hope that this one will help
>
> BUG: using smp_processor_id() in preemptible [00000001] code: cpuspeed/1433
> caller is lock_cpu_hotplug+0x25/0xc5

Not really, that one seems to be a bug in my debugging patch :-/

Dave

--
http://www.codemonkey.org.uk

2006-08-06 19:33:06

by Andrew Morton

[permalink] [raw]
Subject: Re: 2.6.18-rc3-g3b445eea BUG: warning at /usr/src/linux-git/kernel/cpu.c:51/unlock_cpu_hotplug()

On Sun, 6 Aug 2006 18:59:54 +0200
"Michal Piotrowski" <[email protected]> wrote:

> caller is lock_cpu_hotplug+0x25/0xc5
> [<fd95918f>] store_speed+0x36/0x9b [cpufreq_userspace]
> [<c029fa95>] store+0x37/0x48
> [<c0104007>] show_trace_log_lvl+0x58/0x159
> [<c0104765>] show_trace+0xd/0x10
> [<c010482d>] dump_stack+0x19/0x1b
> [<c01fc842>] debug_smp_processor_id+0x96/0xac
> [<c01a8197>] sysfs_write_file+0xa6/0xcc
> [<c013e0db>] lock_cpu_hotplug+0x25/0xc5
> [<fd95918f>] store_speed+0x36/0x9b [cpufreq_userspace]
> [<c029fa95>] store+0x37/0x48
> [<c01a8197>] sysfs_write_file+0xa6/0xcc
> [<c0171577>] vfs_write+0xcd/0x179
> [<c0171c20>] sys_write+0x3b/0x71
> [<c010318d>] sysenter_past_esp+0x56/0x8d
> DWARF2 unwinder stuck at sysenter_past_esp+0x56/0x8d
> Leftover inexact backtrace:
> [<c0104765>] show_trace+0xd/0x10
> [<c010482d>] dump_stack+0x19/0x1b
> [<c01fc842>] debug_smp_processor_id+0x96/0xac
> [<c013e0db>] lock_cpu_hotplug+0x25/0xc5
> [<fd95918f>] store_speed+0x36/0x9b [cpufreq_userspace]
> [<c029fa95>] store+0x37/0x48
> [<c01a8197>] [<c0171577>] vfs_write+0xcd/0x179
>
> It's from 2.6.18-rc3-mm2.

This "unwinder stuck" thing seems to be very common.

It's a false-positive in this case - the backtrace was complete. It would
be good if we could make the did-we-get-stuck detector a bit smarter. Even
special-casing "sysenter_past_esp" would stop a lot of this..

2006-08-07 01:26:40

by Andi Kleen

[permalink] [raw]
Subject: Re: 2.6.18-rc3-g3b445eea BUG: warning at /usr/src/linux-git/kernel/cpu.c:51/unlock_cpu_hotplug()

> > [<c0171577>] vfs_write+0xcd/0x179
> > [<c0171c20>] sys_write+0x3b/0x71
> > [<c010318d>] sysenter_past_esp+0x56/0x8d
> This "unwinder stuck" thing seems to be very common.

Yes, there are still a lot of bugs in the unwind annotation
unfortunately.

We're also slowly discovering that some things we do cannot
even be expressed in CFI, so some code has to change.

>
> It's a false-positive in this case - the backtrace was complete. It would
> be good if we could make the did-we-get-stuck detector a bit smarter. Even
> special-casing "sysenter_past_esp" would stop a lot of this..

Actually it's not completely false in this case -- it should
have reached user mode and stopped there, but for some reason
I didn't and already stopped still in the kernel.

Most likely the CFI annotation for that sysenter path is not complete.

It's on my todo list to investigate but I still hope Jan does it first ;-)

-Andi


2006-08-07 08:17:47

by Jan Beulich

[permalink] [raw]
Subject: Re: 2.6.18-rc3-g3b445eea BUG: warning at /usr/src/linux-git/kernel/cpu.c:51/unlock_cpu_hotplug()

>> It's a false-positive in this case - the backtrace was complete. It would
>> be good if we could make the did-we-get-stuck detector a bit smarter. Even
>> special-casing "sysenter_past_esp" would stop a lot of this..
>
>Actually it's not completely false in this case -- it should
>have reached user mode and stopped there, but for some reason
>I didn't and already stopped still in the kernel.
>
>Most likely the CFI annotation for that sysenter path is not complete.

Correct, the return point of sysexit (SYSENTER_RETURN) is still in kernel space,
but its annotations are invisible to the unwinder. We should make the VDSO be
treated as user-mode code despite living above PAGE_OFFSET.

>It's on my todo list to investigate but I still hope Jan does it first ;-)

I'll try to, once I've got through moving the Xen code from 2.6.16 to 2.6.18-rc3

Jan

2006-08-15 12:23:03

by Jan Beulich

[permalink] [raw]
Subject: Re: 2.6.18-rc3-g3b445eea BUG: warning at /usr/src/linux-git/kernel/cpu.c:51/unlock_cpu_hotplug()

>> > [<c0171577>] vfs_write+0xcd/0x179
>> > [<c0171c20>] sys_write+0x3b/0x71
>> > [<c010318d>] sysenter_past_esp+0x56/0x8d
>> This "unwinder stuck" thing seems to be very common.
>
>Yes, there are still a lot of bugs in the unwind annotation
>unfortunately.
>
>We're also slowly discovering that some things we do cannot
>even be expressed in CFI, so some code has to change.
>
>>
>> It's a false-positive in this case - the backtrace was complete. It
would
>> be good if we could make the did-we-get-stuck detector a bit
smarter. Even
>> special-casing "sysenter_past_esp" would stop a lot of this..
>
>Actually it's not completely false in this case -- it should
>have reached user mode and stopped there, but for some reason
>I didn't and already stopped still in the kernel.
>
>Most likely the CFI annotation for that sysenter path is not
complete.

They seem to be correct, at least on the default path. I have no
problem with it finding and stopping at the first user frame; I'd like
to note, however, that this (a) is *with* the patches I sent earlier
today (they didn't change anything annotation-wise) and (b) the
size reported for sysenter_past_esp is slightly different on my
system (above trace says 0x8d, mine is 0x79), while the offsets
are 0x56 in both cases. I'd assume the size delta is due to
CONFIG_TRACE_IRQFLAGS, which then I can't see how it would
make a difference other than in the case where
trace_hardirqs_{on,off} themselves would fault (as their call
sites are un-annotated), which doesn't appear to be the case
here.

Again, if "unwinder stuck" messages appear, I'll need a raw
stack dump to accompany the stack trace.

>It's on my todo list to investigate but I still hope Jan does it first
;-)

Jan