2005-09-13 07:58:37

by Keith Owens

Subject: 2.6.14-rc1 BUG: spinlock wrong owner on CPU#0

Booting 2.6.14-rc1 on ia64, I sometimes get

BUG: spinlock wrong owner on CPU#0, swapper/1
lock: e000003003014940, .magic: dead4ead, .owner: pdflush/75, .owner_cpu: 0

The first line is always swapper/1, the .owner varies between pdflush
and migration/1. Backtrace is

0xe0000034f7890000 1 0 1 0 R 0xe0000034f7890300 *swapper
0xa000000100012fc0 dump_stack
args (0xa0000001007f8550, 0xe000003003014940, 0xdead4ead, 0xe0000030064d02d4, 0x4b)
0xa0000001003f6650 spin_bug+0x130
args (0xe000003003014940, 0xa0000001007f85b0, 0xe0000030064d0000, 0xa0000001003f6710, 0x308)
0xa0000001003f6710 _raw_spin_unlock+0x90
args (0xe000003003014940, 0xe00000300301494c, 0xe000003003014950, 0xa000000100713ac0, 0x286)
0xa000000100713ac0 _spin_unlock_irqrestore+0x20
args (0xe000003003014940, 0x10085a2010, 0xa0000001000a2470, 0x894, 0xe0000034f7988030)
0xa0000001000a2470 try_to_wake_up+0x8f0
args (0xe0000034f7897cf0, 0x3, 0x6e, 0x6e, 0xe000003003014940)
0xa0000001000a2530 default_wake_function+0x30
args (0xe0000034f798fdc0, 0x3, 0x0, 0xa00000010009a530, 0x50e)
0xa00000010009a530 __wake_up_common+0x90
args (0xe0000034f7980050, 0x3, 0x1, 0x0, 0x0)
0xa0000001000a46b0 __wake_up+0x50
args (0xe0000034f7980038, 0x3, 0x1, 0x0, 0x10085a2010)
0xa0000001000d72c0 __queue_work+0xa0
args (0xe0000034f7980000, 0xe0000034f7897d80, 0x10085a6010, 0xa0000001000d8bd0, 0x309)
0xa0000001000d8bd0 queue_work+0xf0
args (0xe0000034f7980000, 0xe0000034f7897d80, 0x1, 0xa0000001000e2e40, 0x610)
0xa0000001000e2e40 kthread_create+0x300
args (0xe0000034f7897d48, 0xe0000034f7897d10, 0xa0000001007e1ed8, 0x10, 0xe0000034f7897e30)
0xa00000010010f1f0 start_one_pdflush_thread+0x30
args (0xa000000100876d90, 0x183, 0xe000003003400000)
0xa000000100876d90 pdflush_init+0x30
args (0xa000000100009640, 0x690, 0x0)
0xa000000100009640 init+0x3a0
args (0xa0000001008aaaa0, 0x1, 0xa0000001009dc508, 0xa0000001009dc520, 0xa0000001009dc530)
0xa000000100010b60 kernel_thread_helper+0xe0
args (0xa0000001008430d0, 0x0, 0xa000000100009120, 0x2, 0xa000000100bdb150)
0xa000000100009120 start_kernel_thread+0x20
args (0xa0000001008430d0, 0x0)

or

BUG: spinlock wrong owner on CPU#0, swapper/1
lock: e000003003014940, .magic: dead4ead, .owner: migration/1/5, .owner_cpu: 0

Call Trace:
[<a000000100012720>] show_stack+0x80/0xa0
sp=e0000034f7897c50 bsp=e0000034f7891070
[<a000000100012ff0>] dump_stack+0x30/0x60
sp=e0000034f7897e20 bsp=e0000034f7891058
[<a0000001003f6650>] spin_bug+0x130/0x160
sp=e0000034f7897e20 bsp=e0000034f7891028
[<a0000001003f6710>] _raw_spin_unlock+0x90/0x120
sp=e0000034f7897e20 bsp=e0000034f7890ff0
[<a000000100711ae0>] _spin_unlock_irqrestore+0x20/0xa0
sp=e0000034f7897e20 bsp=e0000034f7890fc8
[<a0000001000a2470>] try_to_wake_up+0x8f0/0x980
sp=e0000034f7897e20 bsp=e0000034f7890f40
[<a0000001000a25f0>] wake_up_process+0x30/0x60
sp=e0000034f7897e30 bsp=e0000034f7890f20
[<a0000001000baf60>] cpu_callback+0x1e0/0x240
sp=e0000034f7897e30 bsp=e0000034f7890ee8
[<a0000001000d0360>] notifier_call_chain+0xe0/0x140
sp=e0000034f7897e30 bsp=e0000034f7890eb0
[<a0000001000ebe30>] cpu_up+0x2d0/0x360
sp=e0000034f7897e30 bsp=e0000034f7890e58
[<a0000001000094a0>] init+0x200/0x840
sp=e0000034f7897e30 bsp=e0000034f7890de8
[<a000000100010b60>] kernel_thread_helper+0xe0/0x100
sp=e0000034f7897e30 bsp=e0000034f7890dc0
[<a000000100009120>] start_kernel_thread+0x20/0x40
sp=e0000034f7897e30 bsp=e0000034f7890dc0


And no, I am not calling curr_task() or set_curr_task() anywhere before
this trace appears ...

DEBUG_KERNEL=y
MAGIC_SYSRQ=y
LOG_BUF_SHIFT=20
DETECT_SOFTLOCKUP=y
DEBUG_SLAB=y
DEBUG_PREEMPT=y
DEBUG_SPINLOCK=y
DEBUG_SPINLOCK_SLEEP=y
DEBUG_KOBJECT=y
DEBUG_INFO=y
DEBUG_FS=y
KDB=y
KDB_MODULES=y
KDB_CONTINUE_CATASTROPHIC=0


2005-09-13 09:17:26

by Ingo Molnar

Subject: Re: 2.6.14-rc1 BUG: spinlock wrong owner on CPU#0


* Keith Owens <[email protected]> wrote:

> Booting 2.6.14-rc1 on ia64, I sometimes get
>
> BUG: spinlock wrong owner on CPU#0, swapper/1
> lock: e000003003014940, .magic: dead4ead, .owner: pdflush/75, .owner_cpu: 0

hm, ia64 uses __ARCH_WANT_UNLOCKED_CTXSW and thus it releases the
runqueue lock early - so a certain assumption in the new, improved
spinlock debugging code does not apply. Does the patch below help?

Ingo

----

fix up the runqueue lock owner only if we truly did a context-switch
with the runqueue lock held. Impacts ia64, mips, sparc64 and arm.

Signed-off-by: Ingo Molnar <[email protected]>

--- kernel/sched.c.orig
+++ kernel/sched.c
@@ -294,6 +294,10 @@ static inline void prepare_lock_switch(r

 static inline void finish_lock_switch(runqueue_t *rq, task_t *prev)
 {
+#ifdef CONFIG_DEBUG_SPINLOCK
+	/* this is a valid case when another task releases the spinlock */
+	rq->lock.owner = current;
+#endif
 	spin_unlock_irq(&rq->lock);
 }

@@ -1529,10 +1533,6 @@ static inline void finish_task_switch(ru
 	 * Manfred Spraul <[email protected]>
 	 */
 	prev_task_flags = prev->flags;
-#ifdef CONFIG_DEBUG_SPINLOCK
-	/* this is a valid case when another task releases the spinlock */
-	rq->lock.owner = current;
-#endif
 	finish_arch_switch(prev);
 	finish_lock_switch(rq, prev);
 	if (mm)
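
[Editorial illustration, not part of the original mail.] The owner check and the effect of moving the fixup can be sketched in userspace C. This is a simplified model under stated assumptions, not kernel code: `struct task`, `struct dbg_spinlock`, `dbg_lock()`, `dbg_unlock()` and the two `simulate_*` functions are hypothetical names invented for the sketch, and the interleaving shown for the unlocked case is one plausible reading of the report (a task on another CPU rewriting the owner while the waker holds the runqueue lock), not something the thread confirms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel's task and debug-spinlock types. */
struct task {
	const char *comm;
	int pid;
};

struct dbg_spinlock {
	int locked;
	const struct task *owner;	/* recorded when the lock is taken */
};

/* Take the lock and record who took it. */
static void dbg_lock(struct dbg_spinlock *l, const struct task *cur)
{
	assert(!l->locked);
	l->locked = 1;
	l->owner = cur;
}

/* Release the lock; return 1 if the owner check would have fired. */
static int dbg_unlock(struct dbg_spinlock *l, const struct task *cur)
{
	int wrong_owner = (l->owner != cur);

	assert(l->locked);
	if (wrong_owner)
		printf("BUG: spinlock wrong owner, %s/%d (.owner: %s/%d)\n",
		       cur->comm, cur->pid,
		       l->owner ? l->owner->comm : "<none>",
		       l->owner ? l->owner->pid : -1);
	l->owner = NULL;
	l->locked = 0;
	return wrong_owner;
}

/*
 * Model an architecture that context-switches with the runqueue lock
 * already dropped. If fixup_after_switch is nonzero, the task that was
 * just switched in rewrites the owner field even though it does not hold
 * the lock (the pre-patch behaviour as described in the thread). If zero,
 * it leaves the lock alone (the behaviour with the patch applied).
 */
static int simulate_unlocked_ctxsw(int fixup_after_switch)
{
	const struct task waker = { "swapper", 1 };
	const struct task switched_in = { "migration/1", 5 };
	struct dbg_spinlock rq_lock = { 0, NULL };

	/* CPU A: the waker takes the runqueue lock in a wakeup path. */
	dbg_lock(&rq_lock, &waker);

	/* CPU B: a task finishes switching in without holding the lock. */
	if (fixup_after_switch)
		rq_lock.owner = &switched_in;

	/* CPU A: the waker releases the lock it legitimately holds. */
	return dbg_unlock(&rq_lock, &waker);
}

/* Model the locked case: prev takes the lock, next releases it. */
static int simulate_locked_ctxsw(int fixup_before_unlock)
{
	const struct task prev = { "pdflush", 75 };
	const struct task next = { "swapper", 1 };
	struct dbg_spinlock rq_lock = { 0, NULL };

	dbg_lock(&rq_lock, &prev);		/* prev enters the scheduler */
	if (fixup_before_unlock)
		rq_lock.owner = &next;		/* hand ownership to next */
	return dbg_unlock(&rq_lock, &next);	/* next releases the lock */
}
```

In this model the fixup is needed only when the lock really is held across the switch (`simulate_locked_ctxsw`), and it is harmful when the lock was released early (`simulate_unlocked_ctxsw`), which is the distinction the patch draws by moving the assignment into `finish_lock_switch()`.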

2005-09-13 09:46:28

by Keith Owens

Subject: Re: 2.6.14-rc1 BUG: spinlock wrong owner on CPU#0

On Tue, 13 Sep 2005 11:17:59 +0200,
Ingo Molnar <[email protected]> wrote:
>
>* Keith Owens <[email protected]> wrote:
>
>> Booting 2.6.14-rc1 on ia64, I sometimes get
>>
>> BUG: spinlock wrong owner on CPU#0, swapper/1
>> lock: e000003003014940, .magic: dead4ead, .owner: pdflush/75, .owner_cpu: 0
>
>hm, ia64 uses __ARCH_WANT_UNLOCKED_CTXSW and thus it releases the
>runqueue lock early - so a certain assumption in the new, improved
>spinlock debugging code does not apply. Does the patch below help?

Works for me, but it needs to be a -p1 patch, not -p0.