LinuxLists.cc - [BUG][tip/master] kernel panic while locking selftest at qspinlock

2015-07-10 11:33:07

Subject: [BUG][tip/master] kernel panic while locking selftest at qspinlock_paravirt.h:137!

Hi,

I've hit a kernel panic on Locking API testsuite on kvm-qemu.

The top commit is abf9b5f800eb13e53543ff284177efb538dc68fd.
It seems there is a bug in paravirt qspinlock implementation.

To make the configuration for reproducing this bug is very simple,
make allmodconfig && make localmodconfig.

Here is the kernel message which I've recovered from kernel logbuffer by using gdb.

-----
Locking API testsuite:
----------------------------------------------------------------------------
| spin |wlock |rlock |mutex | wsem | rsem |
--------------------------------------------------------------------------
A-A deadlock: ok | ok | ok | ok | ok | ok |
A-B-B-A deadlock: ok | ok | ok | ok | ok | ok |
A-B-B-C-C-A deadlock: ok | ok | ok | ok | ok | ok |
A-B-C-A-B-C deadlock: ok | ok | ok | ok | ok | ok |
A-B-B-C-C-D-D-A deadlock: ok | ok | ok | ok | ok | ok |
A-B-C-D-B-D-D-A deadlock: ok | ok | ok | ok | ok | ok |
A-B-C-D-B-C-D-A deadlock: ok | ok | ok | ok | ok | ok |
double unlock:
------------[ cut here ]------------
kernel BUG at /home/mhiramat/ksrc/linux-3/kernel/locking/qspinlock_paravirt.h:137!
invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.2.0-rc1-01639-gabf9b5f #2
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
task: ffffffff81e2b700 ti: ffffffff81e00000 task.ti: ffffffff81e00000
RIP: 0010:[<ffffffff811152bc>] [<ffffffff811152bc>] __pv_queued_spin_unlock+0xd0/0x110
RSP: 0000:ffffffff81e07e60 EFLAGS: 00010202
RAX: 0000000000000010 RBX: ffffffff828e8c50 RCX: 0000000000000100
RDX: ffff88007fe8efc0 RSI: 00000000000000ff RDI: ffffffff828e8c50
RBP: ffffffff81e07e60 R08: ffff88007fe8eec0 R09: 0000000000000100
R10: 0000000000000000 R11: 2020202020202020 R12: 0000000000000000
R13: 0000000000000001 R14: ffffffff814603b1 R15: 0000ffffffff82c8
FS: 0000000000000000(0000) GS:ffff88006ce00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000ffffffff CR3: 0000000001e22000 CR4: 00000000000006b0
Stack:
ffffffff81e07ec8 ffffffff81114a59 2020202020202020 0000000000000000
0000000000000000 0000000000000000 ffffffff828e8c50 ffffffff81c8cf83
00000000000000b8 00000000000000b8 ffffffff81117133 00000000000000b8

Call Trace:
[<ffffffff81114a59>] __raw_callee_save___pv_queued_spin_unlock+0x11/0x1e
[<ffffffff81117133>] ? do_raw_spin_unlock+0xfa/0x10c
[<ffffffff817cd3f7>] _raw_spin_unlock+0x44/0x64
[<ffffffff814603ee>] double_unlock_spin+0x3d/0x46
[<ffffffff817c1e42>] dotest+0x85/0x16e
[<ffffffff81471431>] locking_selftest+0x67d/0x2a80
[<ffffffff82c9062a>] start_kernel+0x5bc/0x874
[<ffffffff82c8fc1d>] ? set_init_arg+0xb6/0xb6
[<ffffffff82c8f120>] ? early_idt_handler_array+0x120/0x120
[<ffffffff82c8f625>] x86_64_start_reservations+0x46/0x4f
[<ffffffff82c8f84c>] x86_64_start_kernel+0x21e/0x234
Code: 44 fe ca 75 62 eb 2d 48 ff c1 48 ff 05 de 73 c7 02 48 8d 14 01 48 21 f2 48 c1 e2 04 4c 01 c2 4c 39 c9 72 af 48 ff 05 d4 73 c7 02 <0f> 0b 48 ff 05 d3 73 c7 02 48 83 3d ab e8 d7 00 00 48 63 78 40
RIP [<ffffffff811152bc>] __pv_queued_spin_unlock+0xd0/0x110
RSP <ffffffff81e07e60>
---[ end trace ffa8b6c1f29ba3a3 ]---
Kernel panic - not syncing: Fatal exception
---[ end Kernel panic - not syncing: Fatal exception"

And here is the result of backtrace from gdb.

(gdb) bt
#0 __delay (loops=1) at /home/mhiramat/ksrc/linux-3/arch/x86/lib/delay.c:108
#1 0xffffffff8144f02a in __const_udelay (xloops=<optimized out>)
at /home/mhiramat/ksrc/linux-3/arch/x86/lib/delay.c:122
#2 0xffffffff817b6090 in panic (fmt=<optimized out>)
at /home/mhiramat/ksrc/linux-3/kernel/panic.c:201
#3 0xffffffff8101fecd in oops_end (flags=514,
regs=0xffffffff81e07db8 <init_thread_union+32184>, signr=11)
at /home/mhiramat/ksrc/linux-3/arch/x86/kernel/dumpstack.c:249
#4 0xffffffff8102039f in die (str=0xffffffff81c6b82e "invalid opcode",
regs=0xffffffff81e07db8 <init_thread_union+32184>, err=0)
at /home/mhiramat/ksrc/linux-3/arch/x86/kernel/dumpstack.c:316
#5 0xffffffff8101b33f in do_trap_no_signal (error_code=<optimized out>,
regs=<optimized out>, str=<optimized out>, trapnr=<optimized out>,
tsk=<optimized out>)
at /home/mhiramat/ksrc/linux-3/arch/x86/kernel/traps.c:204
#6 do_trap (trapnr=<optimized out>, signr=<optimized out>,
str=0x0 <irq_stack_union>, regs=0x0 <irq_stack_union>, error_code=1,
info=<optimized out>)
at /home/mhiramat/ksrc/linux-3/arch/x86/kernel/traps.c:250
#7 0xffffffff8101b8fe in do_error_trap (
regs=0xffffffff81e07db8 <init_thread_union+32184>, error_code=0,
str=0xffffffff81c6b82e "invalid opcode", trapnr=6, signr=4)
at /home/mhiramat/ksrc/linux-3/arch/x86/kernel/traps.c:289
#8 0xffffffff8101bf7d in do_invalid_op (regs=<optimized out>,
error_code=<optimized out>)
at /home/mhiramat/ksrc/linux-3/arch/x86/kernel/traps.c:302
#9 0xffffffff817d004e in invalid_op ()
at /home/mhiramat/ksrc/linux-3/arch/x86/entry/entry_64.S:826
#10 0x0000ffffffff82c8 in ?? ()
#11 0xffffffff814603b1 in bad_unlock_order_spin ()
at /home/mhiramat/ksrc/linux-3/lib/locking-selftest.c:532
#12 0x0000000000000001 in irq_stack_union ()
#13 0x0000000000000000 in ?? ()

--
Masami HIRAMATSU
Linux Technology Research Center, System Productivity Research Dept.
Center for Technology Innovation - Systems Engineering
Hitachi, Ltd., Research & Development Group
E-mail: [email protected]

2015-07-10 13:00:44

by Peter Zijlstra

[permalink] [raw]

Subject: Re: [BUG][tip/master] kernel panic while locking selftest at qspinlock_paravirt.h:137!

On Fri, Jul 10, 2015 at 08:32:46PM +0900, Masami Hiramatsu wrote:

> double unlock:
> ------------[ cut here ]------------
> kernel BUG at /home/mhiramat/ksrc/linux-3/kernel/locking/qspinlock_paravirt.h:137!

> Call Trace:
> [<ffffffff81114a59>] __raw_callee_save___pv_queued_spin_unlock+0x11/0x1e
> [<ffffffff81117133>] ? do_raw_spin_unlock+0xfa/0x10c
> [<ffffffff817cd3f7>] _raw_spin_unlock+0x44/0x64
> [<ffffffff814603ee>] double_unlock_spin+0x3d/0x46

Cute, but somewhat expected. A double unlock really is a BUG and the PV
spinlock code cannot deal with it.

Do we want to make double unlock non-fatal unconditionally?

---
kernel/locking/qspinlock_paravirt.h | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h
index 04ab18151cc8..172deeaf1311 100644
--- a/kernel/locking/qspinlock_paravirt.h
+++ b/kernel/locking/qspinlock_paravirt.h
@@ -286,15 +286,22 @@ __visible void __pv_queued_spin_unlock(struct qspinlock *lock)
{
struct __qspinlock *l = (void *)lock;
struct pv_node *node;
+ u8 locked;

/*
* We must not unlock if SLOW, because in that case we must first
* unhash. Otherwise it would be possible to have multiple @lock
* entries, which would be BAD.
*/
- if (likely(cmpxchg(&l->locked, _Q_LOCKED_VAL, 0) == _Q_LOCKED_VAL))
+ locked = cmpxchg(&l->locked, _Q_LOCKED_VAL, 0);
+ if (likely(locked == _Q_LOCKED_VAL))
return;

+#ifdef CONFIG_DEBUG_LOCKING_API_SELFTESTS
+ if (unlikely(!locked))
+ return;
+#endif
+
/*
* Since the above failed to release, this must be the SLOW path.
* Therefore start by looking up the blocked node and unhashing it.

2015-07-10 13:57:57

by Ingo Molnar

[permalink] [raw]

Subject: Re: [BUG][tip/master] kernel panic while locking selftest at qspinlock_paravirt.h:137!

* Peter Zijlstra <[email protected]> wrote:

> On Fri, Jul 10, 2015 at 08:32:46PM +0900, Masami Hiramatsu wrote:
>
> > double unlock:
> > ------------[ cut here ]------------
> > kernel BUG at /home/mhiramat/ksrc/linux-3/kernel/locking/qspinlock_paravirt.h:137!
>
> > Call Trace:
> > [<ffffffff81114a59>] __raw_callee_save___pv_queued_spin_unlock+0x11/0x1e
> > [<ffffffff81117133>] ? do_raw_spin_unlock+0xfa/0x10c
> > [<ffffffff817cd3f7>] _raw_spin_unlock+0x44/0x64
> > [<ffffffff814603ee>] double_unlock_spin+0x3d/0x46
>
> Cute, but somewhat expected. A double unlock really is a BUG and the PV
> spinlock code cannot deal with it.
>
> Do we want to make double unlock non-fatal unconditionally?

No, just don't BUG() out, don't crash the system - generate a warning?

Thanks,

Ingo

2015-07-10 14:28:44

by Peter Zijlstra

[permalink] [raw]

Subject: Re: [BUG][tip/master] kernel panic while locking selftest at qspinlock_paravirt.h:137!

On Fri, Jul 10, 2015 at 03:57:46PM +0200, Ingo Molnar wrote:
> * Peter Zijlstra <[email protected]> wrote:

> > Do we want to make double unlock non-fatal unconditionally?
>
> No, just don't BUG() out, don't crash the system - generate a warning?

So that would be a yes..

Something like so then? Won't this generate a splat on that locking self
test then? And upset people?

---
kernel/locking/qspinlock_paravirt.h | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h
index 04ab18151cc8..286e8978a562 100644
--- a/kernel/locking/qspinlock_paravirt.h
+++ b/kernel/locking/qspinlock_paravirt.h
@@ -133,8 +133,14 @@ static struct pv_node *pv_unhash(struct qspinlock *lock)
* This guarantees a limited lookup time and is itself guaranteed by
* having the lock owner do the unhash -- IFF the unlock sees the
* SLOW flag, there MUST be a hash entry.
+ *
+ * This can trigger due to double-unlock. In which case, return a
+ * random pointer so that __pv_queued_spin_unlock() can dereference it
+ * without crashing.
*/
- BUG();
+ WARN_ON_ONCE(true);
+
+ return (struct pv_node *)this_cpu_ptr(&mcs_nodes[0]);
}

/*

2015-07-11 06:48:02

by 平松雅巳 / HIRAMATU，MASAMI

[permalink] [raw]

Subject: Re: [BUG][tip/master] kernel panic while locking selftest at qspinlock_paravirt.h:137!

On 2015/07/10 23:28, Peter Zijlstra wrote:
> On Fri, Jul 10, 2015 at 03:57:46PM +0200, Ingo Molnar wrote:
>> * Peter Zijlstra <[email protected]> wrote:
>
>>> Do we want to make double unlock non-fatal unconditionally?
>>
>> No, just don't BUG() out, don't crash the system - generate a warning?
>
> So that would be a yes..
>
> Something like so then? Won't this generate a splat on that locking self
> test then? And upset people?

Hmm, yes, this still noisy...
Can't we avoid double-unlock completely? it seems that this warning can
happen randomly, which means pv-spinlock randomly broken, doesn't it?

Thank you,

> ---
> kernel/locking/qspinlock_paravirt.h | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h
> index 04ab18151cc8..286e8978a562 100644
> --- a/kernel/locking/qspinlock_paravirt.h
> +++ b/kernel/locking/qspinlock_paravirt.h
> @@ -133,8 +133,14 @@ static struct pv_node *pv_unhash(struct qspinlock *lock)
> * This guarantees a limited lookup time and is itself guaranteed by
> * having the lock owner do the unhash -- IFF the unlock sees the
> * SLOW flag, there MUST be a hash entry.
> + *
> + * This can trigger due to double-unlock. In which case, return a
> + * random pointer so that __pv_queued_spin_unlock() can dereference it
> + * without crashing.
> */
> - BUG();
> + WARN_ON_ONCE(true);
> +
> + return (struct pv_node *)this_cpu_ptr(&mcs_nodes[0]);
> }
>
> /*
>

--
Masami HIRAMATSU
Linux Technology Research Center, System Productivity Research Dept.
Center for Technology Innovation - Systems Engineering
Hitachi, Ltd., Research & Development Group
E-mail: [email protected]

2015-07-11 06:44:43

by Waiman Long

[permalink] [raw]

Subject: Re: [BUG][tip/master] kernel panic while locking selftest at qspinlock_paravirt.h:137!

On 07/10/2015 08:32 PM, Masami Hiramatsu wrote:
> On 2015/07/10 23:28, Peter Zijlstra wrote:
>> On Fri, Jul 10, 2015 at 03:57:46PM +0200, Ingo Molnar wrote:
>>> * Peter Zijlstra<[email protected]> wrote:
>>>> Do we want to make double unlock non-fatal unconditionally?
>>> No, just don't BUG() out, don't crash the system - generate a warning?
>> So that would be a yes..
>>
>> Something like so then? Won't this generate a splat on that locking self
>> test then? And upset people?
> Hmm, yes, this still noisy...
> Can't we avoid double-unlock completely? it seems that this warning can
> happen randomly, which means pv-spinlock randomly broken, doesn't it?

It shouldn't randomly happen. The message should be printed at the first
instance of double-unlock. If that is not case, there may be some
problem in the code.

Anyway, I have an alternative fix that should better capture the problem:

-------------------------------
diff --git a/kernel/locking/qspinlock_paravirt.h
b/kernel/locking/qspinlock_paravirt.h
index 04ab181..92fc54f 100644
--- a/kernel/locking/qspinlock_paravirt.h
+++ b/kernel/locking/qspinlock_paravirt.h
@@ -286,15 +286,24 @@ __visible void __pv_queued_spin_unlock(struct
qspinlock *lock)
{
struct __qspinlock *l = (void *)lock;
struct pv_node *node;
+ u8 lockval = cmpxchg(&l->locked, _Q_LOCKED_VAL, 0);

/*
* We must not unlock if SLOW, because in that case we must first
* unhash. Otherwise it would be possible to have multiple @lock
* entries, which would be BAD.
*/
- if (likely(cmpxchg(&l->locked, _Q_LOCKED_VAL, 0) == _Q_LOCKED_VAL))
+ if (likely(lockval == _Q_LOCKED_VAL))
return;

+ if (unlikely(lockval != _Q_SLOW_VAL)) {
+ printk(KERN_WARNING
+ "pvqspinlock: lock 0x%lx has corrupted value 0x%x!\n",
+ (unsigned long)lock, atomic_read(&lock->val));
+ WARN_ON_ONCE(1);
+ return;
+ }
+
/*
* Since the above failed to release, this must be the SLOW path.
* Therefore start by looking up the blocked node and unhashing it.

2015-07-11 06:47:59

by 平松雅巳 / HIRAMATU，MASAMI

[permalink] [raw]

Subject: Re: [BUG][tip/master] kernel panic while locking selftest at qspinlock_paravirt.h:137!

On 2015/07/11 10:27, Waiman Long wrote:
> On 07/10/2015 08:32 PM, Masami Hiramatsu wrote:
>> On 2015/07/10 23:28, Peter Zijlstra wrote:
>>> On Fri, Jul 10, 2015 at 03:57:46PM +0200, Ingo Molnar wrote:
>>>> * Peter Zijlstra<[email protected]> wrote:
>>>>> Do we want to make double unlock non-fatal unconditionally?
>>>> No, just don't BUG() out, don't crash the system - generate a warning?
>>> So that would be a yes..
>>>
>>> Something like so then? Won't this generate a splat on that locking self
>>> test then? And upset people?
>> Hmm, yes, this still noisy...
>> Can't we avoid double-unlock completely? it seems that this warning can
>> happen randomly, which means pv-spinlock randomly broken, doesn't it?
>
> It shouldn't randomly happen. The message should be printed at the first
> instance of double-unlock. If that is not case, there may be some
> problem in the code.

Ah, OK. That comes from locking selftest. In that case, do we really
need the warning while selftest, since we know it always fails ?

> Anyway, I have an alternative fix that should better capture the problem:

Do we need both Peter's BUG() removing patch and this?

Thank you,

> -------------------------------
> diff --git a/kernel/locking/qspinlock_paravirt.h
> b/kernel/locking/qspinlock_paravirt.h
> index 04ab181..92fc54f 100644
> --- a/kernel/locking/qspinlock_paravirt.h
> +++ b/kernel/locking/qspinlock_paravirt.h
> @@ -286,15 +286,24 @@ __visible void __pv_queued_spin_unlock(struct
> qspinlock *lock)
> {
> struct __qspinlock *l = (void *)lock;
> struct pv_node *node;
> + u8 lockval = cmpxchg(&l->locked, _Q_LOCKED_VAL, 0);
>
> /*
> * We must not unlock if SLOW, because in that case we must first
> * unhash. Otherwise it would be possible to have multiple @lock
> * entries, which would be BAD.
> */
> - if (likely(cmpxchg(&l->locked, _Q_LOCKED_VAL, 0) == _Q_LOCKED_VAL))
> + if (likely(lockval == _Q_LOCKED_VAL))
> return;
>
> + if (unlikely(lockval != _Q_SLOW_VAL)) {
> + printk(KERN_WARNING
> + "pvqspinlock: lock 0x%lx has corrupted value 0x%x!\n",
> + (unsigned long)lock, atomic_read(&lock->val));
> + WARN_ON_ONCE(1);
> + return;
> + }
> +
> /*
> * Since the above failed to release, this must be the SLOW path.
> * Therefore start by looking up the blocked node and unhashing it.
>
>

--
Masami HIRAMATSU
Linux Technology Research Center, System Productivity Research Dept.
Center for Technology Innovation - Systems Engineering
Hitachi, Ltd., Research & Development Group
E-mail: [email protected]

2015-07-11 10:22:40

by Peter Zijlstra

[permalink] [raw]

Subject: Re: [BUG][tip/master] kernel panic while locking selftest at qspinlock_paravirt.h:137!

On Fri, Jul 10, 2015 at 09:27:45PM -0400, Waiman Long wrote:
> Anyway, I have an alternative fix that should better capture the problem:
>
> -------------------------------
> diff --git a/kernel/locking/qspinlock_paravirt.h
> b/kernel/locking/qspinlock_paravirt.h
> index 04ab181..92fc54f 100644
> --- a/kernel/locking/qspinlock_paravirt.h
> +++ b/kernel/locking/qspinlock_paravirt.h
> @@ -286,15 +286,24 @@ __visible void __pv_queued_spin_unlock(struct
> qspinlock *lock)
> {
> struct __qspinlock *l = (void *)lock;
> struct pv_node *node;
> + u8 lockval = cmpxchg(&l->locked, _Q_LOCKED_VAL, 0);
>
> /*
> * We must not unlock if SLOW, because in that case we must first
> * unhash. Otherwise it would be possible to have multiple @lock
> * entries, which would be BAD.
> */
> - if (likely(cmpxchg(&l->locked, _Q_LOCKED_VAL, 0) == _Q_LOCKED_VAL))
> + if (likely(lockval == _Q_LOCKED_VAL))
> return;
>
> + if (unlikely(lockval != _Q_SLOW_VAL)) {
> + printk(KERN_WARNING
> + "pvqspinlock: lock 0x%lx has corrupted value 0x%x!\n",
> + (unsigned long)lock, atomic_read(&lock->val));
> + WARN_ON_ONCE(1);

WARN_ONCE(1, "foo");

> + return;
> + }

Right, so since this should not ever happen in 'sane' code, its a shame
to have to put in this condition. But yes, this works too.

2015-07-11 10:27:24

by Ingo Molnar

[permalink] [raw]

Subject: Re: [BUG][tip/master] kernel panic while locking selftest at qspinlock_paravirt.h:137!

* Peter Zijlstra <[email protected]> wrote:

> On Fri, Jul 10, 2015 at 03:57:46PM +0200, Ingo Molnar wrote:
> > * Peter Zijlstra <[email protected]> wrote:
>
> > > Do we want to make double unlock non-fatal unconditionally?
> >
> > No, just don't BUG() out, don't crash the system - generate a warning?
>
> So that would be a yes..
>
> Something like so then? Won't this generate a splat on that locking self
> test then? And upset people?
>
> ---
> kernel/locking/qspinlock_paravirt.h | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h
> index 04ab18151cc8..286e8978a562 100644
> --- a/kernel/locking/qspinlock_paravirt.h
> +++ b/kernel/locking/qspinlock_paravirt.h
> @@ -133,8 +133,14 @@ static struct pv_node *pv_unhash(struct qspinlock *lock)
> * This guarantees a limited lookup time and is itself guaranteed by
> * having the lock owner do the unhash -- IFF the unlock sees the
> * SLOW flag, there MUST be a hash entry.
> + *
> + * This can trigger due to double-unlock. In which case, return a
> + * random pointer so that __pv_queued_spin_unlock() can dereference it
> + * without crashing.
> */
> - BUG();
> + WARN_ON_ONCE(true);
> +
> + return (struct pv_node *)this_cpu_ptr(&mcs_nodes[0]);

Yeah, just please also use debug_locks_silent to make the self-test execute
properly or so.

Thanks,

Ingo

2015-07-12 03:10:05

by Waiman Long

[permalink] [raw]

Subject: Re: [BUG][tip/master] kernel panic while locking selftest at qspinlock_paravirt.h:137!

On 07/11/2015 01:05 AM, Masami Hiramatsu wrote:
> On 2015/07/11 10:27, Waiman Long wrote:
>> On 07/10/2015 08:32 PM, Masami Hiramatsu wrote:
>>> On 2015/07/10 23:28, Peter Zijlstra wrote:
>>>> On Fri, Jul 10, 2015 at 03:57:46PM +0200, Ingo Molnar wrote:
>>>>> * Peter Zijlstra<[email protected]> wrote:
>>>>>> Do we want to make double unlock non-fatal unconditionally?
>>>>> No, just don't BUG() out, don't crash the system - generate a warning?
>>>> So that would be a yes..
>>>>
>>>> Something like so then? Won't this generate a splat on that locking self
>>>> test then? And upset people?
>>> Hmm, yes, this still noisy...
>>> Can't we avoid double-unlock completely? it seems that this warning can
>>> happen randomly, which means pv-spinlock randomly broken, doesn't it?
>> It shouldn't randomly happen. The message should be printed at the first
>> instance of double-unlock. If that is not case, there may be some
>> problem in the code.
> Ah, OK. That comes from locking selftest. In that case, do we really
> need the warning while selftest, since we know it always fails ?
>
>> Anyway, I have an alternative fix that should better capture the problem:
> Do we need both Peter's BUG() removing patch and this?
>

No, you can choose either one. They are just different ways to solve the
same BUG() problem.

Cheers,
Longman